The purposes of the present study are twofold: the
first  is  to  examine  the  relationship  between  background 
characteristics  of  physicians  and  the  groupings  into  which 
they  locate  in  a large  metropolitan  area  and  whether  or  not 
this  relationship  changes  over  time;  the  second  is  to  evalu- 
ate a set  of  methodologies  for  handling  data  of  indeterminate 
accuracy  over  a long  historical  period.  The  geographical 
area  in  which  the  research  was  conducted  is  the  west  side  of 
Cleveland  and  Cuyahoga  County,  Ohio,  and  the  period  under 
study  extends  from  1912  to  1969.  The  background  characteris- 
tics considered  are  those  for  which  data  are  available  in  the 
American Medical Association's biennial directories of physi-
cians. These variables for each physician are name, address
of practice, birthdate, location of medical school, date of
licensing by state medical board, specialty, and type of
practice.

The following initial presumptions were formulated:

Presumption I:

LOC = f(YEAR, BDATE, LICDATE, STATE, SPEC)

and

Presumption II:

The relationship between LOC and the independent vari-
ables in the above equation has changed over time.
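
One simple way to picture Presumption II is to fit the location measure against a single background variable separately for each directory year and compare the fitted slopes. The sketch below is illustrative only: the records are invented, the function names are hypothetical, and a one-variable least-squares fit stands in for the fuller analyses used in the study.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Return the least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def slopes_by_year(records):
    """Group (year, licdate, loc) records by directory year; fit one slope per year."""
    groups = {}
    for year, licdate, loc in records:
        groups.setdefault(year, []).append((licdate, loc))
    return {
        year: ols_slope([lic for lic, _ in rows], [loc for _, loc in rows])
        for year, rows in sorted(groups.items())
    }


# Invented records (assumption): directory year, licensing date, and the
# location measure expressed as miles from the CBD.
records = [
    (1920, 1900, 2.0), (1920, 1910, 2.5), (1920, 1915, 2.75),
    (1960, 1930, 3.0), (1960, 1945, 6.0), (1960, 1955, 8.0),
]
slopes = slopes_by_year(records)
```

A slope that differs markedly between directory years would be consistent with Presumption II; similar slopes would not support it.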

Using  location  as  distance  of  practice  from  the  Central 
Business  District  (CBD)  several  methodologies  were  employed 
and  evaluated.  They  include  those  that  treat  location  as  a 
univariate  measure  such  as  Automatic  Interaction  Detection  and 
regression analysis, those that consider location as a multi-
variate dependent measure such as canonical correlation anal-
ysis, and those that treat location as a noncontinuous or
nominal measure such as cluster analysis and discriminant
analysis.
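
The univariate location measure described above can be sketched as a straight-line distance computed from planar coordinates. This is a minimal illustration under stated assumptions: the CBD reference point, the coordinate units, and the practice coordinates are all hypothetical, and the study's own distance calculation may differ.

```python
import math

# Hypothetical CBD reference point (assumption), in the same planar grid
# units as the practice coordinates.
CBD = (0.0, 0.0)


def distance_from_cbd(x, y, cbd=CBD):
    """Euclidean distance of a practice at (x, y) from the CBD reference point."""
    return math.hypot(x - cbd[0], y - cbd[1])


# Invented practice locations west of the CBD (negative X).
practices = {"A": (-3.0, 4.0), "B": (-6.0, -8.0)}
distances = {name: distance_from_cbd(x, y) for name, (x, y) in practices.items()}
```

Keeping the X- and Y-coordinates as a pair, rather than collapsing them to one distance, corresponds to the multivariate treatment of location.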

Results  of  the  various  modes  of  analysis  strongly 
support  the  first  of  the  two  initial  presumptions.  Support 
for  the  second  is  less  compelling  than  that  for  the  first. 
Consideration  of  all  results  suggests: 

1.  That  of  the  background  characteristics,  date  of 
licensing  has  been  important  at  all  times  in  deter- 
mining where  physicians  locate. 

2.  That  since  1940  doctors'  locations  have  expanded 
dramatically outward from the inner city.

3. That the distance from the central city of a physi-
cian's practice is a function of his or her date
of licensing.

4. That the location of doctors' practices as measured
by X- and Y-coordinates has become more complex
since 1940 with a new core of doctors centered
along the Rocky River.

5. That techniques which measure location as multi-
variate give more importance to other character-
istics such as location of medical school and
specialty than do those which measure location as
a univariate.

CHAPTER I
INTRODUCTION 


Geographic  maldistribution  of  health  manpower  represents 
one  of  the  most  serious  availability  and  access  barriers 
to quality health care in this nation today. (Paul G.
Rogers, U. S. Congressman; quoted in P. R. Lee et al.,
1976, xvii)


Although  not  everyone  within  or  without  the  medical  pro- 
fession would  give  the  geographical  variable  the  importance 
of  the  above  statement,  a substantial  literature  has  appeared 
in  the  last  few  years  on  the  distribution  or  maldistribution 
of  physicians  in  the  United  States.*  This  literature  is  con- 
cerned mainly  with  the  problems  of  physician**  location  at  the 
national  or  state  level  and  urban  versus  rural  practice. 

Less attention has been directed to the question of intra-
urban or metropolitan physician-practice location.

An  important  overview  of  the  research  done  in  the  U.  S. 
on  the  subject  of  physician  distribution  to  that  date  is  the 
1974  work  of  Anderson  and  Marshall,  in  which  they  indicate 


*For comprehensive bibliographies covering most of this
literature see U. S. Department of Health, Education, and
Welfare (1974), Hart (1975), and Lee, J. M. et al. (1976).

**The  terms  doctor,  medical  practitioner,  and  physician 
are  used  interchangeably  to  denote  those  licensed  to  prac- 
tice medicine  in  Ohio  during  the  years  covered  in  the  study. 
shortcomings in previous research and suggest directions for
future research. They divide physician-location studies into
two main groups: those that focus on the individual physician
and  those  that  use  a structural  or  aggregate  approach.  The 
individual  approach  looks  at  factors  that  influence  the  in- 
dividual doctor  in  choosing  his  practice  location.  The 
structural  approach  suggests  that  the  "aggregated  character- 
istics of  individuals,  such  as  averages  or  rates,  or  charac- 
teristics of  the  group's  organization,  such  as  the  presence 
of a medical school or number of hospital beds," (Anderson
and Marshall, 1974, 196), are among the factors that deter-
mine the aggregate locational characteristics of physicians
in  a community.  Such  an  approach  reflects  the  sociological 
orientation  of  the  authors  and  the  direction  of  much  of  the 
previous  research  in  medical  care  availability.  However,  a 
more  comprehensive  understanding  of  the  processes  that  deter- 
mine ultimately  the  location  of  individual  medical  practi- 
tioners results  from  the  use  of  aspects  of  both  approaches. 

One  of  the  earliest  examinations  of  physician  location 
using a combination of approaches is that of Schneider (1967),
a part of a larger study of medical facilities in Cincinnati
in relation to patient demand. Schneider employed statisti-
cal techniques to demonstrate an interaction between hospi-
tals, physician-practice locations, and physicians' residen-
ces. However germane to geographic research on health facil-
ities Schneider's work was, his section on physician location
was notable mainly for demonstrating the need for further
research  in  this  area. 

In  1973  Donald  Dewey  published  a monograph  as  part  of 
the  Chicago  Regional  Hospital  Study  in  which  he  attacked  the 
problem  of  intra-urban  or  metropolitan  physician  location. 
Dewey, together with other staff on the Chicago Regional Hos-
pital Study, used cartographic analysis plus centrographic
techniques to analyze the relationship between distribution
of physicians and population in the Chicago SMSA during the
period from 1950 to 1970. Detailed maps were derived showing
the patterns of physicians, total population, Negro popula-
tion, and socioeconomic status as measured primarily by re-
tail sales. The areal units of the study were either Commu-
nity Areas (CA), derived from the work of Burgess in the
1920s, or Health Care Areas (HCA) as defined by de Vise in
the 1960s (Dewey, 1973, 5-6). Each numbered around 200 for
that part of the Chicago SMSA which is contained in Illinois.
Dewey, himself, summarized his findings as follows:

There has been a relative decline in Chicago's
proportion  of  the  nation's  doctors  as  well  as  an  ab- 
solute decrease  in  the  number  of  private  practitioners, 
while  the  population  was  growing  rapidly.  This  has  re- 
sulted in  a severe  decline  in  the  ratio  of  private  prac- 
tice physicians  to  population  in  the  Chicago  area. 

Physicians  are  decentralizing  out  of  the  City  and 
into  the  Suburbs.  More  specifically,  they  are  leaving 
those  areas  which  are  changing  from  high  to  low  socio- 
economic status  and  those  whose  racial  structures  are 
changing  from  white  to  black,  and  they  are  going  to  the 
more affluent areas of the Northern and Western Suburbs.

The disparity in the distribution of physicians'
offices  is  the  antithesis  of  the  appropriate  distri- 
bution based  on  need  as  indicated  by  morbidity. 

Specialists'  offices  display  an  even  greater  con- 
centration in  affluent  areas  than  all  physicians'  of- 
fices. Hence,  the  distribution  of  specialists'  offices 
is  even  more  irregular  and  these  doctors  are  less 
available  to  the  poor  of  the  Metropolitan  Area. 

The  trend  toward  increased  specialization  in  the 
medical  profession  will  result  in  a more  inequitable 
distribution  of  physicians'  offices  in  the  future. 

Finally,  the  so-called  doctor  shortage  in  the  cen- 
tral cities  of  large  metropolitan  areas,  with  high  con- 
centrations of  poverty,  could  be  partially  alleviated 
by  the  redistribution  of  the  available  doctors. 

(Dewey,  1973,  151)  [Emphases  are  Dewey's] 

Chicago, because of its large size, its particular mix
of industry and commerce, and perhaps because of its ethnic
and racial mix, is not archetypal of United States' SMSAs.
If, however, Dewey's conclusions are reaffirmed by similar
research in other cities, the implications for health care in
the U. S. are grave; physician maldistribution is increasing
and delivery to those most in need is decreasing.

Dewey's work is qualified not only by the uniqueness of
Chicago but also by problems within his methodology. These
include, firstly, the fact that a large amount of data was
reduced to a few summary measures. Secondly, although he is
one of the few researchers to examine physician location over
a time span, the period that he used--from 1950 to 1970--is
relatively short and trends apparent within it may or may
not be part of long-term trends.

Several recent dissertations and theses in geography
(Schultz, 1971; Guptill, 1974; Bryan, 1975; and Hart, 1975)
have looked at the distribution of U. S. physicians in
specific metropolitan areas. Of these studies only those of
Schultz  and  Hart  incorporated  change  over  time  as  part  of 
their  analyses.  Schultz's  work  (also  reported  in  Schultz, 
1975)  used  a space  potential  model  to  analyze  the  relation- 
ship between  physician  location  and  population  or  income  po- 
tentials. Hart's  work,  on  the  other  hand,  incorporated  some 
elements  of  the  individual  approach  in  his  study.  Neverthe- 
less, and  despite  the  fact  that  Hart  examined  structural 
changes  over  time,  he  did  not  attempt  to  assess  changes  at 
the  individual  level  over  time. 

Other  recent  studies  of  intra-urban  physician  location 
also  have  concentrated  on  the  structural  side  of  the  loca- 
tion-decision processes  or  have  looked  only  at  one  particu- 
lar instant  in  time.  These  studies  include  Dorsey  (1969), 
Fine (1971), Elesh and Schollaert (1972), Kaplan and
Leinhardt (1973), Guzick and Jahiell (1976), Barnett and
Barnett (1977), and Barnett (1978).

It  is  to  the  problems  raised  by  the  foregoing  studies 
that the present research--by looking at another U. S.
metropolitan area, by using multivariate analytical tech-
niques, and by employing a time span that covers roughly
two generations--addresses itself.

Much  of  the  work,  including  that  of  Dewey,  concerned 
with  physician-practice  location  within  metropolitan  areas 
has  been  structural.  At  larger  regional,  state,  and  national 
levels  of  investigation  the  emphasis  has  been  primarily  upon 
the impact of individual characteristics. In part the pre-
dominance of sociological interest in many of these studies
at the metropolitan level may contribute to this apparent
dichotomy.

A more  appropriate  and  effective  research  strategy  at 
the intra-urban level would incorporate elements of both ap-
proaches. These two approaches are, in fact, complementary.
Each urban area is a set of neighborhoods; within each neigh-
borhood are people and services. Among the services is the
distribution  of  medical  care  involving,  for  example,  physi- 
cians, druggists,  community  health  centers,  hospitals,  and 
nursing homes. These neighborhoods represent the accumula-
tion of location decisions made over the entire history of
the particular city.

When  a physician  decides  to  locate  or  relocate  in  a 
given  urban  area,  he  then  must  choose  among  the  neighbor- 
hoods or  subcommunities  comprising  the  area.  In  so  doing,  his 
individual  goals  must  be  measured  against  attributes  of  each 
neighborhood. The individual location decision will be ef-
fected within this matrix of neighborhood (aggregate) attri-
butes. In order to best understand that decision, it is ne-
cessary to gain insight into both its determinants: the doc-
tor's desires and the neighborhood's characteristics. To
date research has paid scant attention to the doctor's pre-
ferences in intra-urban physician location. As a result we
fail to obtain comprehensive insight into the physician mal-
distribution problem.

Because physicians in the United States are relatively
free  to  practice  where  they  wish,  the  characteristics  of  the 
subcommunities  in  which  they  choose  to  practice  assume  an 
additional  emphasis  in  the  decision-making  process.  The 
physician  in  most  urban  areas  has  had  few  constraints  placed 
upon  the  practice-location  decision  because  of  the  high  de- 
mand and  controlled  supply  of  similar  services  for  the  past 
sixty  or  seventy  years.  Because  of  generally  high  demand 
for  medical  care  and  low  supply,  a physician  has  been  rela- 
tively certain  of  earning  an  attractive  income  in  almost  any 
location  he  chooses  in  the  urban  area.  And  because  of  this 
virtually  assured  financial  security,  he  has  little  trouble 
in  obtaining  finance  at  any  location  he  chooses.  In  addition, 
because  of  the  public's  desire  for  medical  services  and  the 
innoxious  quality  of  most  physicians'  offices,  he  normally 
is  subject  to  relatively  few  zoning  controls.  And  even  where 
zoning  controls  have  been  nominally  strict,  the  physician  has 
often  been  able  to  obtain  zoning  exceptions  for  his  desired 
office  location  (Kaplan  and  Leinhardt,  1973). 

Purposes  of  Study 

One  of  the  principal  purposes  of  this  study,  then,  is 
the  examination  of  the  background  to  the  location  decisions 
of  physicians  in  part  of  a large  metropolitan  area,  Cleveland, 
Ohio,  over  an  extended  period  of  time,  the  years  1910  to  1970, 
in  order  to  better  understand  the  changing  distribution  of 
physicians within U. S. urban areas and to give a greater in-
sight into the personal location-decision process itself.

The  other  primary  purpose  is  to  explore  methodologies  that 
may  contribute  to  understanding  decision-influencing  factors 
in  a historical  context.  Essentially,  this  implies  an  in- 
vestigation and  evaluation  of  techniques  which  can  cope  with 
large  amounts  of  data  of  indeterminate  accuracy. 

Choice  of  Study  Location  (Figure  1) 

The Cleveland metropolitan area was chosen for the fol-
lowing reasons:

1. Familiarity resulting from earlier research under-
taken with Gary Shannon (Shannon et al., 1973;
Shannon et al., 1975).

2.  The  requirement  of  a study  area  with  a fairly  long 
and  typical  history  of  development  and  of  suffici- 
ently large  size  for  the  development  of  multiple 
and  identifiable  neighborhoods  or  subcommunities. 

The  choice  of  Cleveland  on  the  basis  of  the  above  cri- 
teria proved  to  be  fortuitous  for  an  additional  reason: 

3. The enormous quantity and extraordinary quality
of population, housing, and service data collected
mainly in the 1935 to 1955 period by agencies dir-
ected by Howard Whipple Green for the Cleveland
metropolitan area (see, for example, Green, 1934;
Green, 1951a; Green, 1951b; and Green, 1952). The
most important of these agencies was the Real Pro-
perty Inventory of Metropolitan Cleveland under
whose aegis much of the data was published. The
scope  and  completeness  of  Green's  data  have  been 
only  occasionally  exploited.  The  major  effort  is 
the  work  of  Stouffer  (1940)  which  led  to  the 
"Theory  of  Intervening  Opportunity"  that  attempted 
to  explain  the  motivation  behind  migration  patterns. 
More  recently,  some  of  Green's  data  have  been  used 
by  Guest  (1972)  in  his  examination  of  urban  popula- 
tion density  patterns  and  by  Kusmer  (1976)  in  look- 
ing at  the  formation  of  black  ghettos. 

Within  the  Cleveland  area  it  was  necessary  to  restrict 
further  the  study  to  western  Cleveland  and  Cuyahoga  County 
for  the  following  reasons: 

1.  A desire  to  remove  the  effects  of  race  on  physician 
location  in  order  to  better  understand  first  other 
variable  impacts.  This  should  not  be  interpreted 
as  a denial  of  the  importance  of  race,  but  rather 
as  an  attempt  to  establish  a base  that  will  later 
permit  a better  understanding  of  the  effects  of  the 
racial  composition  of  neighborhoods  on  physician 
maldistribution (see Elesh and Schollaert, 1972,
for a study of the effects of race). The study
area has remained almost totally white during the
study period (Kusmer, 1976).

2. The desire to look at the relationship between
residential  neighborhoods  and  the  physicians  they 
attract.  Therefore,  the  Central  Business  District 
(CBD)  was  deleted  from  the  study  area  on  the  as- 
sumption that  at  least  for  the  latter  part  of  the 
time  span  under  consideration  doctors  in  the  CBD 
are  interested  in  serving  all  or  widely  scattered 
parts  of  the  urban  area  rather  than  some  particular 
urban  neighborhood.  The  east  side  of  Cleveland  was 
also  eliminated  because  it  contains  a large  complex 
of  hospitals  and  related  medical  institutions  that 
services  not  only  the  whole  of  the  Cleveland  area 
but  other  parts  of  the  nation  as  well.  The  Cleve- 
land Clinic,  located  in  this  complex,  is  one  of  the 
major  research  hospitals  in  the  country.  The  pres- 
ence of  the  Cleveland  Metropolitan  General  Hospital, 
the  major  public  hospital  in  Cleveland,  within  the 
study  area,  should  have  little  effect  on  the  physi- 
cians in  private  practice  who  are  the  subject  of 
this  study,  because  most  of  the  patients  that  the 
hospital  attracts  from  outside  the  region  use  the 
outpatient facilities (Green, 1938; Finley, 1963).

3. An unsuccessful attempt was made to match popula-
tion growth and physician growth using Morrill's
wave models (Morrill, 1970). This necessitated
the use of a sector in order to isolate in two
dimensions the relationship between the peaks of
the population and physician-migration waves.

FIGURE 1

WESTERN AREA OF CUYAHOGA COUNTY, OHIO, SHOWING THE STUDY AREA

4.  The  presence  of  a broad  spectrum  of  income,  educa- 
tion, and  ethnic  (but  not  racial)  types  within 
the  defined  study  area. 

Organization  of  the  Study 

This  study  is  divided  into  three  major  sections.  The 
first  is  the  introduction  and  background  to  the  research. 

The  second  section  consists  of  the  development  and 
testing  of  the  several  working  hypotheses  about  physician 
location.  Chapter  III  consists  of  the  development  of  the 
hypotheses  and  Chapter  IV  contains  the  tests  using,  firstly, 
the Automatic Interaction Detection (AID) method developed
in  the  Institute  for  Social  Research  at  the  University  of 
Michigan  and  later  extensively  used  in  marketing  research 
(P.  E.  Green,  1978).  This  technique  is  especially  appropri- 
ate where  theory  is  little  developed  and  research  consists 
basically  of  looking  for  the  factors  and  associations  that 
may be relevant. The title of the introductory work on AID,
Searching for Structure, indicates that, as in the present
case, understanding the structure of a research problem is
often more beneficial than its solution.

On  the  basis  of  the  results  obtained  from  AID,  several 
additional research methods were employed including regres-
sion analysis in conjunction with principal components
analysis and discriminant analysis in conjunction with clus-
ter analysis. The results of these operations are also in-
cluded in the second section.

The third section, Chapter VII, consists of the summary
which  restates  and  evaluates  the  findings  in  light  of  pre- 
vious physician-practice  research.  Suggestions  are  made  for 
further  research  in  the  area  of  physician  location. 


CHAPTER  II 

CHANGING  STRUCTURE  AND  ORGANIZATION  OF  PRIVATE 
PHYSICIAN  PRACTICE  IN  THE  U.  S.,  1850-1970 

The  present  study  of  private  physicians  for  reasons  of 
data  availability  begins  in  approximately  1910.  The  years 
around  the  turn  of  the  century  also  witnessed  profound  chan- 
ges in  the  practice  of  medicine  in  the  U.  S.  and  established 
patterns of training and conduct that continue to the pres-
ent. These years, from 1910 to 1970, are considered by many
to be the "golden years" of private medical practice in
America.*

To  understand  how  these  "golden  years"  evolved,  it  is 
necessary  to  examine  the  history  of  private  medical  practice 
in the U. S., a history that begins with the country itself.
The Declaration of Independence included three physicians as
signers and there were a number of physicians among the dele-
gates to the Constitutional Convention of 1787. Benjamin
Rush of Philadelphia, a signer of the Declaration of Indepen-
dence, was perhaps the most famous of these early medical
practitioners, having enormous influence on medical practice
in this country and abroad.


*The  following  discussion  is  a condensation  of  many 
sources.  Special  note  should  be  made  of  Stevens's  (1971) 
lengthy history of American medical practitioners.

However, it was not until 1847, when the American Medi-
cal Association (AMA) was first organized, that there was a
substantial move toward standardization of medical practice.
Moreover, the AMA grew slowly and represented a minority of
the private physicians until well after the turn of the cen-
tury.

Before  the  founding  of  the  AMA  and  during  its  formative 
years, the training received by the majority of medical prac-
titioners was very poor. In the early part of the nineteenth
century medical training was primarily taken as an appren-
ticeship to a practicing doctor. The latter 1800s saw the
establishment  of  the  medical  school  system.  However,  the 
majority  of  medical  schools  were  proprietary  and  few  pos- 
sessed even  the  most  basic  scientific  facilities.  They  were, 
in  most  cases,  not  connected  with  hospitals  and  medical  stu- 
dents received  no  practical  training.  Learning  came  largely 
from lectures supplemented occasionally by reading.

Basic  to  present  conceptions  of  good  health  care  is  the 
ratio  of  medical  practitioners  to  population.  In  1900  there 
were  about  124,000  physicians  in  the  U.  S.,  or  a ratio  of  one 
doctor to every 600 persons (in 1970 the ratio was one to
645) (Stevens, 1971, 420). But in many cases these doctors
had  little  to  offer  their  patients  other  than  advice,  prac- 
tical or  otherwise.  The  scientific  revolution  had  not  yet 
affected private medical practice in this country.

Nevertheless, the scientific revolution had begun to
transform  medicine.  Pasteur,  Koch,  Lister,  and  other  early 
bacteriologists  were  providing  the  first  explanation  of  the 
transmission  of  contagious  disease  by  bacteria  in  the  1860s 
and  1870s.  With  this  understanding  it  was  only  a short  step 
to  the  large-scale  development  of  immunology. 

Notwithstanding  these  advances,  the  practice  of  private 
medicine was still held in relatively low repute (Shryock,
1947; Stevens, 1971). This was at least partially due to
the fact that the aforementioned developments were in the
areas of preventive medicine. Most of these measures were
best applied en masse by public health authorities. There
was still relatively little need for the average person in
the U. S. to turn to a private practitioner for care in the
event of illness. In fact, as noted by one writer of the
time:

Before  about  1915,  it  has  been  suggested,  the 
average  person  had  little  more  than  a 50-50  chance  of 
benefiting  from  an  encounter  with  the  average  doctor. 
(Stevens, 1971, 135: Attributed to Lawrence J.
Henderson by Alan Gregg in Challenge to Contemporary
Medicine. New York, 1956, p. 13)

At  the  end  of  the  nineteenth  century,  there  occurred 
developments  which  were  to  change  fundamentally  private  medi- 
cal care.  These  included  the  development  of  an  antitoxin 
against  diphtheria  by  Behring  in  Berlin  and  Roux  in  Paris  in 
the  early  1890s  and  the  discovery  by  Ehrlich,  also  in  Berlin, 
in  1902  of  the  specificity  with  which  certain  substances 
attack disease-causing organisms in animals and humans.
Ehrlich, after eight years of research with 606 substances,
found one, salvarsan, that destroyed the syphilis-causing
agent, Treponema pallidum, and led to almost immediate cure
of the disease in both rabbits and man.

Thus,  through  immunization  and  specific  drugs,  medical 
practitioners,  for  the  first  time,  had  within  their  profes- 
sional armament  comparatively  "sure-fire"  methods  for  disease 
care.  As  long  as  physicians  were  able  to  maintain  control  of 
this armament, they could be assured of a steady clientele.
And from 1900 onward the medical profession, principally
through the efforts of the AMA, has attempted to maintain
its control and in most cases has succeeded.

Throughout  the  nineteenth  century  the  loosely  organized 
AMA  had  relatively  little  political  or  economic  power.  Be- 
ginning in  1902  the  AMA  underwent  substantial  organizational 
change.  Membership  requirements  were  tightened  and  stipula- 
tions were  made  to  assure  that  control  of  the  AMA  would  re- 
main solely  within  the  organized  medical  profession.  County 
and  state  medical  societies  were  strengthened,  but  district 
medical  societies  were  discouraged.  These  moves  ensured  a 
hierarchical  structure  within  the  AMA  itself  and  also  that 
there  would  be  no  structure  to  rival  the  national  organization. 

The  bolstered  governing  council  of  the  AMA  began  moves 
to  improve  the  standard  of  medical  education  in  the  U.  S. 
Coincident  was  the  privately  sponsored  study  of  medical 
schools headed by Abraham Flexner which appeared in published
form as Medical Education in the United States and Canada in
1910. Flexner and his group examined medical schools in the
U. S. and of the 155 then operating found only three, Harvard,
Johns Hopkins, and Western Reserve, to be satisfactory. The
report recommended that 124 of the 155 medical schools be
closed permanently. Each of the remaining schools should be
attached to a university--effectively eliminating propri-
etary schools--and should have no more than seventy gradu-
ating students per year.

The  Flexner  report  generated  immense  publicity  and  com- 
ment both  within  and  without  the  medical  profession  and  the 
change,  if  not  reform,  of  medical  education  started  immedi- 
ately. By  1920  there  remained  only  eighty-five  medical 
schools  and  all  of  these  had  upgraded  substantially  their 
entrance  requirements  and  curricula.  All  now  required  at 
least  two  years  of  college  work  before  admission;  most  had 
required  none  before  the  Flexner  report.  Consequently,  the 
number  of  medical  graduates  decreased  and  their  exposure 
and  commitment  to  the  scientific  method  in  the  practice  of 
medicine  increased. 

But  the  scientific  revolution  which  spawned  the  weapons 
against  disease  that  have  benefited  much  of  mankind  and 
enriched  private  practitioners  in  the  western  world,  had 
also  set  in  motion  trends  which  have  accelerated  and 
which ultimately may reduce substantially the importance,
power, and prestige of the private practitioner. Two
of these trends are specialization and increased reliance
on  technology  and  its  apparatus.  These  developments  are  not 
confined  to  medical  care  but  extend  through  many  aspects  of 
western  society;  they  have  been  particularly  rapid,  however, 
in medical care (Bell, 1973; Maxmen, 1976).

The development of inoculation and antibiotics was at
least partially caused by these two trends. The people who
made  these  advances  were  bacteriologists  and  immunologists 
although  they  were  not  then  called  by  these  names.  And  they 
relied  on  the  technology  of  microscopes  and  chemical  analyses 
to  analyze  and  test  their  developments.  Furthermore,  they 
depended  on  hospitals  as  an  institution  to  provide  the  set- 
ting for  controlled  treatment  and  testing  of  patients. 

Before this, hospitals had served largely as dumping grounds
for the dying. Increasingly, they would become the necessary
concomitant of the practicing physician; no other institution
could gather together the manpower and expertise to
apply the new treatments to the physician's patients. And
until  recently,  the  private  physician,  through  the  referral 
process,  maintained  a large  measure  of  control  over  entrance 
to  hospitals  and  thus  over  the  medical  care  system  itself. 

Centralization  and  technology  which,  since  the  turn  of 
the  century,  have  increased  the  power  and  prestige  of  pri- 
vate medical  practitioners,  now  threaten  the  control  they 
exercise.  One  such  force  is  the  federal  government  through 
recent initiatives such as Medicare and community health planning
organizations (Stevens, 1971). Another is technological
developments such as computers and related electronic devices.
Much of the diagnostic process may in time be automated,
thus reducing the role of the individual physician
(Cook, 1979; Maxmen, 1976).

If  these  forces  continue  and  the  position  of  the  pri- 
vate practitioner  declines,  then  the  years  from  1900  to  1970 
will  truly  have  been  the  "golden  years"  of  the  private  physi- 
cian in  America. 


CHAPTER  III 

DEVELOPMENT  OF  THE  RESEARCH  METHODOLOGY  AND  HYPOTHESES 

The goals of the present research are twofold: the
first is to look at the relationships between the background
characteristics  of  physicians  and  the  groupings  into  which 
they  locate  in  a large  metropolitan  area;  the  second  is  to 
evaluate  a set  of  methodologies  for  handling  data  of  inde- 
terminate accuracy  over  a long  historical  period.  The  metro- 
politan area,  the  western  part  of  the  Cleveland,  Ohio,  SMSA, 
was  chosen  for  reasons  of  its  size  and  history,  the  availa- 
bility of  data,  and  familiarity  with  the  area.  This  chapter 
outlines  the  reasons  for  the  choice  of  a set  of  methods  and 
discusses  the  hypotheses  that  will  be  tested  with  these  meth- 
ods. 

Two papers, the products of research institutions on opposite
sides of the United States, combine to form the theoretical
background for the methodology used in this study.
One is an address to the American Psychological Association
by John Tukey of the Bell Telephone Laboratories in New Jersey
(Tukey, 1969). The second is the work of Nicholas
Rescher  and  Olaf  Helmer,  philosophers  at  the  Rand  Corporation 
in California, entitled "On the Epistemology of the Inexact
Sciences," published in Management Science (Helmer and
Rescher, 1959).

Tukey is usually referred to as a statistician although
he  considers  himself  a "data  analyst"  (Cooley  and  Lohnes, 
1971,  v)  and  the  purposes  of  data  analysis  are  the  subject 
of his address. The title, "Analyzing Data: Sanctification
or Detective Work," reveals Tukey's intentions; most researchers
would prefer to be called detectives rather than sanctifiers.
Tukey's purpose, however, is not to deprecate confirmatory
data analysis or the customary inferential statistics;
instead it is to suggest that exploratory data analysis plays
an equal role with confirmatory analysis and that the researcher
or data analyst should be flexible--even imaginative--
in his explorations. Tukey's article begins:

Both  exploratory  and  confirmatory  data  analysis 
deserve our attention. Both detection and adjudication
play  crucial  roles — in  the  progress  of  science  as  in 
the  control  of  crime. 

To  concentrate  on  confirmation,  to  the  exclusion 
or  submergence  of  exploration,  is  an  obvious  mistake. 
Where does new knowledge come from? How can an undetected
criminal be put on trial?

Exploration relies greatly on looking around. . . .

There really seems to be no substitute for "looking at
the data."

When  we  calculate,  rather  than  looking,  we  must 
seek  the  same  flexibility. 

We  ought  to  try  to  calculate  what  will  help  us  most 
to  understand  our  data,  and  their  indications.  We  ought 
not  to  be  bound  by  preconceived  notions--or  precon- 
ceived analyses.  . . . 

There  is  no  substitute  for  examining  indications. 

We  want  to  know  what  the  data  seem  to  say,  whether  or 
not  we  can  prove  that  they  mean  it.  (Tukey,  1969,  83) 

[Emphases  are  Tukey's] 

In  his  call  for  flexibility  Tukey  has  reformulated  the 
true goals of research. But by eliminating the necessity
for hard and fast rules or formulas for data analysis, Tukey
also  removes  the  certainties  by  which  research  results  are 
usually  judged.  This  problem  is  the  subject  of  the  paper  by 
Helmer  and  Rescher. 

Helmer  and  Rescher  addressed  their  article  to  the  social 
sciences; however, they viewed their epistemology as applicable
to all inexact sciences, "be they social sciences or
[the] 'as yet' inexact physical sciences" (Helmer and
Rescher, 1959, 27). The essence of Helmer and Rescher's
thesis is that in the inexact sciences progress may be made
through the establishment of "quasi-laws." In order for
quasi-laws to be established as valid, and more importantly,
useful,

. . . it is not necessary that no apparent exceptions
occur, it is only necessary that if apparent exceptions
occur,  an  adequate  explanation  be  forthcoming,  an  ex- 
planation demonstrating  the  exceptional  characteristic 
of  the  case  in  hand  by  establishing  the  violation  of  an 
appropriate  (if  hitherto  unformulated)  condition  of  the 
Law's  applicability.  (Helmer  and  Rescher,  1959,  29) 

Quasi-laws  are  useful  because  they  may  be  used  as  the 
basis  for  prediction,  if  not  explanation;  for  the  measure  of 
prediction is "only that it establish its hypothesis simply
as more credible than any comparable alternative" (Helmer
and Rescher, 1959, 32). Hence, a good quasi-law is one which
consistently produces more useful predictions than any comparable
alternative.

Parenthetically,  it  is  interesting  to  note  that  Stouffer, 
the  originator  of  one  of  the  most  famous  "laws"  in  the  social 
sciences, the "Theory of Intervening Opportunity," explaining
migration flows, made similar statements as early as 1940:

The  ultimate  utility  of  the  abstract  theory  will 
be  determined  by  the  variety  and  abundance  of  concrete 
situations  in  which  it  proves  helpful  in  providing  at 
least  an  initial  ordering  of  thinking  and  of  data. 

Even when quantitative data are inadequate or unavailable,
the theory may have its uses in contributing to a
logical framework for analyzing tendencies. (Stouffer,
1940, 846) [Emphasis is Stouffer's]

Rescher further developed this and other related theses
in a series of books of which two are particularly important
to the discussion here. The first, Plausible Reasoning
(Rescher, 1976), sets out a calculus by which laws may be
judged plausible or not on the basis of conflicting evidence.
Rescher enunciates two rules for "plausible reasoning":

[Rule 1:] . . . the plausibility-ranking of a plausible
thesis that is derivable from some group of mutually
consistent theses is never to be less than that of the
least plausible thesis operative in the derivation.
(Rescher, 1976, 12)

This will be characterized as the If-Worst-Comes-
to-Worst Rule. (Rescher, 1976, 13)

[Rule 2:] . . . when there are alternative routes to the
same proposition from several groups of p-set [set of
plausible propositions] premises, then the best of these
is determinative of its plausibility. (Rescher, 1976,
12) [Emphases are Rescher's]

In  other  words  a thesis  is  only  as  good  as  the  weakest 
link  in  a supporting  argument  and  if  there  are  several  wit- 
nesses to  or  arguments  for  a thesis,  the  most  reliable  or 
strongest  determines  the  thesis'  plausibility  ranking. 
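
Read together, the two rules amount to a worst-case rule within a single derivation and a best-case rule across alternative derivations. The following is a minimal sketch in Python, not Rescher's own notation. It assumes that plausibility rankings can be coded as numbers in which a larger value means a more plausible thesis; the function names are hypothetical. Rule 1 only sets a floor, so taking that floor as the value of a derivation is itself a simplifying assumption.

```python
from typing import Iterable, Sequence


def derivation_floor(premise_ranks: Sequence[float]) -> float:
    """Rule 1: a derived thesis ranks no lower than its least plausible premise."""
    if not premise_ranks:
        raise ValueError("a derivation needs at least one premise")
    return min(premise_ranks)


def thesis_plausibility(derivations: Iterable[Sequence[float]]) -> float:
    """Rule 2: the best of the alternative derivations is determinative."""
    floors = [derivation_floor(ranks) for ranks in derivations]
    if not floors:
        raise ValueError("a thesis needs at least one derivation")
    return max(floors)


# Hypothetical rankings: two independent arguments for the same thesis.
# The first rests on premises ranked 0.9 and 0.4, the second on 0.7 and 0.6.
example = thesis_plausibility([[0.9, 0.4], [0.7, 0.6]])
```

In this invented case the first argument is held down by its weakest premise, and the second, stronger argument fixes the ranking of the thesis.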

How  does  the  researcher  discover  theses  which  may  be 
tested  for  plausibility?  Tukey  (1969)  suggests  we  explore 
the data, which we may take as all evidence, statistical or
otherwise, in an attempt to determine what it means.
Rescher develops a similar concept in his book, Methodological
Pragmatism (1977).

Rescher  contends  that  examination  of  the  evidence  will 
lead the researcher to presumptions. Presumptions are theses
that are "avowedly not known (i. e., known to be true),
but having some claim--however tentative or imperfect--to
be regarded as a truth" (Rescher, 1977, 115) [Emphases are
Rescher's]. Rescher describes the possible development of
a presumption  into  a thesis: 

The acceptance of a thesis is, to be sure, a decisive
act. But like other decisive acts (marriage,
for example) one can take tentative and indecisive steps
in its direction. Taken initially on some slight provisional
and probatively insufficient basis, a thesis can
build up increasing trust. A fundamentally economic
analogy holds good here: a thesis, like a person can
only acquire a solid credit rating by being given credit
(i. e., some credit) in the first place--provisionally
and without any very solid basis. (Rescher, 1977, 115)
[Emphases are Rescher's]

In  other  words  Rescher  says  firstly,  that  one  arrives 
at  a presumption  by  an  examination  of  the  evidence,  and  sec- 
ondly, one  tests  the  presumption  by  as  many  tests  as  are 
feasible, being aware that the results often may be conflicting.
Then, on the basis of the rules of plausible reasoning
one arrives at theses or quasi-laws. Finally, these are then
tested in the real world for usefulness. The ultimate test
of any thesis or quasi-law is whether it is more useful in
prediction than any comparable alternative.

Tukey expresses similar views. On the validation of
quasi-laws  by  usefulness  criteria  he  states: 

The  modern  test  of  significance,  before  which  so 
many editors of psychological journals [not only psychologists]
are reported to bow down, owes more to R. A. Fisher
than to any other man. Yet Sir Ronald's
standard of firm knowledge was not one very extremely
significant result, but rather the ability to repeatedly
get results significant at 5%. . . .

Repetition, each significant, is the basis, according
to Fisher, of scientific truth. (Tukey, 1969, 85)

And  on  the  issues  of  exactness  and  inexactness,  or  cer- 
tainty and  uncertainty,  Tukey  is  emphatic: 

Certainty  is  an  illusion.  We  have  only  to  look  at 
physics  over  the  last  100  years  to  see  that  this  is 
true  for  the  sciences  which  have  earned  the  greatest 
regard.  The  fact  that  "all  the  laws  of  physics  are  wrong, 
though  most  are  extremely  good  approximations"  does  not 
make physics less valuable, either intellectually or
practically.

As  an  illusion,  certainty  can  be  wasteful,  as  well 
as  misleading,  (Tukey,  1969,  85) 

and  finally: 

. . . The search for certainty can only lead us astray.
(Tukey, 1969, 86)


Set  of  Methods 

The  above  paragraphs  paint  the  background  for  the  set  of 
methods  which  is  here  utilized  to  gain  some  insight  into  the 
changing  distribution  of  physicians  on  the  west  side  of  the 
Cleveland  metropolitan  area.  No  claim  is  made  that  the  set 
of methods is original or that this particular methodology
should be followed without modification by other researchers.
The primary hope is that it will alert others to follow
Tukey's injunction that the researcher should look for what
the data seem to say and should not be led astray by the desire
for certainty. The researcher should be willing to try
methods  both  old  and  new  and  to  try  them  over  and  over  again. 

If  varying  samples  and  methods  consistently  produce  sim- 
ilar results,  the  researcher  has  a relatively  easy  time  in- 
terpreting them.  If,  however,  the  results  of  the  varying 
tests  are  conflicting,  as  is  likely,  the  researcher  should  not 
despair  and  give  up  the  task.  Rather,  at  this  point,  he 
should  attempt  to  use  Rescher's  two  maxims  on  choosing  among 
plausible  alternatives  to  suggest  the  directions  or  theses 
that  look  most  promising  for  further  investigation.  This 
path  is  the  most  likely  to  lead  eventually  to  fruitful  re- 
sults. Admittedly,  the  "eventually"  may  prove  to  be  an  in- 
ordinately long  time  in  relation  to  the  possible  benefits  to 
be  gained;  it  is  the  obligation  of  the  researcher  to  weigh 
possible benefits against likely costs at each stage of the
research  procedure.  It  is  obvious  that  this  type  of  research 
procedure often will produce spotty results; it will frequently
be a case of two steps forward and at least one backward.
If, on the other hand, the researcher chooses to abandon
or ignore any research which produces conflicting results
because of an illusory search for certainty, it will be a case
of a few steps backward but almost none forward.

The present study is concerned with determining whether
or  not  there  are  any  relationships  between  a physician's 
background  and  the  location  of  his  practice.  More  particu- 
larly, the  interest  is  in  whether  there  has  been  a change 
in  these  relationships  over  time  in  a sector  of  a large  U.  S. 
metropolitan  area.  Previous  studies  have  focused  princi- 
pally on  the  characteristics  of  the  neighborhoods  in  rela- 
tion to  the  number  of  physicians  they  attract.  The  emphasis 
here  is  on  the  characteristics  of  the  physicians  themselves 
which influence their choice of practice location. Ultimately,
it is hoped to combine both sides of the location
decision, the characteristics of neighborhoods and the backgrounds
of the doctors, into one "equation"; for the moment,
however, research and data analysis are limited to the latter
side of the equation.

Examination  of  "natural"  clusters  of  physicians  in  the 
research area and determination of whether or not the backgrounds
of physicians in these clusters have varied over
the  sixty  years  of  the  study  provide  the  starting  points  of 
the  analysis.  Unfortunately,  this  length  of  time  makes  the 
number  of  variables  on  the  background  of  the  physicians  for 
which  data  may  be  obtained  and  which  are  counted  over  time 
very  limited.  Yet,  despite  the  limited  number  of  variables, 
the large number of observations--there are 2006 doctors in
the sample, many appearing more than once--makes visual observation
and interpretation very difficult. However, visual
observation of the data is nearly always the logical starting
point for any research problem so it is near the top of the
following list of procedures. As Tukey says,


It  may  well  be  true  that  "plot  and  eye"  is  the  most 
diverse  channel  to  the  human  mind.  Not  that  it  trans- 
mits more  bits  per  second,  but  rather  that  it  will  trans- 
mit a greater  variety  of  messages  on  unexpected  topics 
easily  and  rapidly.  (Tukey,  1969,  83) 


Table III-1 shows the set of general procedures which
should be useful in most types of exploratory social research,
particularly where the data are historical. In addition to
general procedures the table gives the specifics of the present
research project.


TABLE III-1
RESEARCH PROCEDURES


General Procedures              Specifics of the Research Project

 1. Formulate general            1. Statement formulated: To look at backgrounds
    statement of problem            of physicians in relation to neighborhood
                                    characteristics in their practice locations
                                    over an extended time span

 2. Place problem in context     2. Literature, rather limited, on intra-urban
    by looking at results of        physician location perused
    similar research

 3. Decide on general            3. Implicit assumption made of a mathematical
    problem approach on             model of the relationship between
    basis of background             neighborhood of practice and physician
    reading                         background characteristics

 4. Find appropriate data        4. Cleveland SMSA selected; data gathered at
                                    approximately 10-year intervals from 1912
                                    to 1969 from American Medical Association

 5. Reexamine problem in         5. On the basis of available data and
    relation to availability        limitations in time and money, a decision
    of data. Approach may be        was made to concentrate research on
    altered in light of data        doctors' background part of "equation"
    availability but problem
    should remain constant

 6. Plot data and look for       6. New doctors for each year plotted: Almost
    what plots seem to show         impossible to interpret because of number
                                    and complexity of patterns

 7. Use general search           7. Automatic Interaction Detection (AID)
    techniques to look for          analysis of doctors' locations (i. e.,
    relationships                   distance from CBD) in relation to doctors'
                                    backgrounds performed

 8. Use more specific and        8. Factor, regression, cluster, and
    powerful research tools         discriminant analyses used to test some of
    to test for relationship        the relationships (hypotheses) that
    thought to be found in          appeared from the plots and general search.
    plots and general search.       Tested for each year of data and sometimes
    Repeat tests several            split sample into random parts. Produced
    times with differing            regression plots, factor plots, cluster and
    samples. Plot tests             discriminant plots
    whenever possible

 9. Use Rescher's maxims for     9. Rescher's maxims used to choose among
    testing between differing       outcomes for the differing analyses
    plausible outcomes

10. Evaluate results in         10. Results evaluated against other literatures
    light of original problem       on intra-urban physician location and
    and other research              suggestions made for further research

It should be noted that the above outline of this research
methodology makes little mention of formulating hypotheses
and testing them by the normal statistical procedures.
This is not to deny the importance of this step but
to  suggest  that  it  should  be  only  one  part  of  a much  more 
comprehensive  test  procedure.  The  formulation  of  statisti- 
cally testable  hypotheses  requires  that  many  assumptions 
about  the  nature  of  the  data  be  made  and  that  a result  be 
anticipated.  We  cannot  ask  the  proper  questions  unless  we 
have  some  idea  of  the  context  of  the  answers.  These  are 
unavoidable  situations  but  ones  that  should  be  made  clear 
in  a particular  research  situation. 

For  instance,  the  question,  "Is  Johnny  going  to  town  to- 
day?" would  be  a foolish  question  in  a number  of  situations; 
e.  g.,  if  there  were  no  Johnny,  or  no  town,  or  if  the  respon- 
dent had  no  knowledge  of  either,  or  if  there  were  no  reason  to 
suspect  that  Johnny  ever  went  to  town,  or  if  the  questioner 
had no use for the information. Similarly, the question,
"Have the physicians' birthplaces made any difference in where
doctors locate their practices in Cleveland, Ohio, over the
last sixty years?" is foolish if one has no way of finding
out the doctors' birthplaces for all doctors in Cleveland for
the  last  sixty  years. 

So  too,  a question  such  as,  "Does  the  number  of  brothers 
a doctor  has  affect  his  or  her  choice  of  practice  location?" 
becomes  much  less  foolish  if  one  learns  that  doctors  who  are 
only children are more likely to be specialists than to be
G. P.s, although at present there is no known evidence for
such  a statement.  This  example,  as  well  as  the  others, 
should  emphasize  that  testable  hypotheses  are  only  one  part 
of  a long  and  tedious  research  procedure  and  should  be  at 
neither  the  beginning  nor  the  end  of  that  procedure.  The 
following  discussion  of  the  actual  research  procedure  should 
illustrate  the  above  discussion  more  fully. 

Data  Sources 

This  study,  like  all  historical-statistical  studies, 
must  cope  with  three  basic  difficulties.  Firstly,  there  are 
few  historical-statistical  sources  that  cover  extended  time 
periods  in  most  areas  of  research;  secondly,  few  data  series 
are  consistent  in  format  or  use  of  units  of  measure  through 
time;  and  finally,  the  determination  of  accuracy  for  many 
historical  data  series  is  often  extremely  difficult  and  fre- 
quently impossible. 


Any  historical  study  of  physicians  in  the  U.  S.  must 
normally rely on either of two sources: the American Medical
Association's  (AMA)  biennial  directories  of  physicians  in  the 
U.  S.,  or  the  various  states'  registries  of  physicians. 
Neither  of  these  sources  is  absolutely  accurate  or  compre- 
hensive for  any  given  area  or  year. 

The state registries are frequently out-of-date by the
time they become publicly available and often do not differentiate
between doctors in active practice and those who are
not. In addition, in order to work with the state registries,
particularly for historical data, it is usually necessary to
go to the state capital for the area of interest. And the
data  gathered  by  the  states  may  be  inconsistent  from  one 
state  to  the  next. 

The AMA directories, on the other hand, are widely available
and contain data usually no more than three years old.
The AMA also attempts to differentiate between doctors in active
practice and those who are not. Moreover, the AMA data
are  consistent  among  states  at  a particular  time.  Unfortu- 
nately, there  is  less  chronological  consistency  in  the  AMA 
data.  The  format  of  the  data  has  been  altered  frequently  and 
the  coding  systems  have  also  been  changed  at  various  times 
through  the  history  of  the  directories.  The  first  appeared 
in 1906 (American Medical Directory, 1906 to date).

Despite  these  inconsistencies,  there  are  several  types 
of data on physicians for which it is possible to get a relatively
consistent record from 1906 to the present day. These
are:

1. Name.

2. Address of practice.

3. Birthdate.

4. Date of licensing by state medical board.

5. Year of graduation from medical school.

6. Specialty.

7. Type of practice.

With considerable effort data for every doctor in the
area  of  interest,  western  Cleveland  and  Cuyahoga  County,  were 
collected  for  the  directory  years  that  most  closely  approxi- 
mate the  census  years  from  1910  to  1970.  These  years  are 
1912, 1921, 1931, 1940, 1950, 1960, and 1969. It was necessary
to do all the data extraction by hand--approximately
300  to  400  man-hours  of  work.  An  effort  was  made  to  secure 
access  to  the  AMA  magnetic  tapes  from  which  the  directories 
since  1960  have  been  produced.  This  effort  was  unsuccessful. 

Data  Collection  Procedures 

In  each  year  for  which  data  were  collected  several  steps 
were required. These are detailed below:

1. All suburbs in western Cuyahoga County at the particular
data year were listed and all the doctors listed
in the directory in these localities were coded
on data sheets.

2. For the central city of Cleveland all doctors' listings
were examined and all those who could possibly
have been in the research area were coded on the
data sheets. For doctors with addresses on numbered
streets this was relatively simple as all numbered
streets in western Cleveland city have the prefix
"West" (e. g., 2500 West 25th Street). For named
streets it was more difficult as there are no prefixes
which immediately identify them as being in
western Cleveland city. Some streets are so common
in the listings that they become quickly known.
For instance, Euclid Avenue and University Boulevard
are common doctors' locations in eastern
Cleveland  whereas  Lorain  and  Detroit  Avenues  are 
common  addresses  in  western  Cleveland.  If  there 
were  doubts  about  whether  or  not  a particular  ad- 
dress was  in  the  research  area,  it  was,  neverthe- 
less, coded  at  this  point. 

3.  All  addresses  were  then  located  on  an  arbitrary 
square-mile  grid  with  an  origin  at  Public  Square, 
the  center  of  the  Cleveland  CBD  and  origin  of 
street  addresses,  and  with  an  east-west,  north- 
south  orientation.  This  was  done  by  overlaying  the 
arbitrary grid on the Official Street Atlas of
Cleveland and Cuyahoga County (1973). Occasionally
it was necessary to refer to earlier street and topographic
maps where there had occurred street or
suburb name changes. There were some changes in the
addressing systems in various suburbs over the study
period; however, the address system in Cleveland city
has remained the same with some minor corrections.

At  this  stage  all  doctors  whose  addresses  actually 
turned  out  to  be  outside  the  study  area  were  re- 
jected. (Most  of  the  remaining  data  transformation 
and  amendment  was  done  by  computer.) 

4. A simple computer program calculated a new variable,
distance from CBD (DISTCBD), from the X- and
Y-coordinates established in the previous step.

5.  Once  the  data  had  been  derived  for  all  selected 
years,  several  computer  sorts  were  done  in  order 
to  perform  a number  of  checks  on  accuracy.  Checks 
on accuracy included sorts by birthdate and license
date which then helped locate doctors who
had changed names from one data year to the next--
by marriage perhaps--or doctors for whom there
had been mistakes in the spelling of their names
from one year to the next. While it is not possible
to state that the data are absolutely accurate,
they have been subjected to numerous
checks,  both  manually  and  by  computer,  and  it  is 
felt  that  there  is  a minimal  number  of  mistakes 
left  that  arise  from  coding  and  transcription  of 
the  data.  Inaccuracies  inherent  in  the  data  source 
remain  largely  unknown. 

6.  Other  computer  sorts  were  performed  in  order  to 
produce  two  new  variables.  Once  doctors'  names 
and  addresses  had  been  checked  and  double  checked, 
it  then  became  possible  to  sort  all  doctors  for 
all  years  by  name  and  address  and  to  produce  a new 
variable which specified whether a doctor

a. appeared on the list for the first time,

b. was on the list the previous year but
at a different address, or

c. was on the list the previous year at
the same address.

Another  variable  was  produced  in  a similar  manner 
which  indicated  whether  for  a particular  year  a 
doctor  left  the  list  in  the  succeeding  year  or 
moved  to  a new  location. 
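
The calculation in step 4 can be sketched briefly. The text does not say how the "simple computer program" measured distance, so the version below assumes straight-line (Euclidean) distance on the square-mile grid whose origin is Public Square. The function name and the sample coordinates are hypothetical.

```python
import math


def dist_cbd(x_miles: float, y_miles: float) -> float:
    """Straight-line distance in miles from the grid origin (Public Square).

    Assumes x_miles and y_miles are the east-west and north-south grid
    coordinates of a practice address, measured from the origin.
    """
    return math.hypot(x_miles, y_miles)


# Hypothetical address three miles west and four miles south of the origin.
sample_distance = dist_cbd(-3.0, -4.0)
```

If distance along the street grid were wanted instead, the sum of the absolute coordinates would replace the square root; nothing in the text indicates which measure was used.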

After  the  preceding  steps,  the  data  included  a total  of 
2006 doctor-observations divided as shown in Table III-2
among  the  data  years. 


TABLE III-2

NUMBER OF DOCTORS BY DATA YEAR


                              New Doctor Codes* (NEWDR)
Data Year    Total Doctors       1        2        3

1912              157           157        -        -
1921              206            96       60       50
1931              264           138       73       53
1940              288           103       65      120
1950              330           130       71      129
1960              366           182       61      123
1969              395           146       95      154

Total            2006           952      425      629

*New  Doctor  Code  1 includes  all  doctors  who  appear  in  the  data  for 
the  first  time  in  a given  year. 

New  Doctor  Code  2 includes  doctors  who  appeared  previously  on  the 
list  but  at  a different  address. 

New  Doctor  Code  3 includes  doctors  who  appeared  previously  on  the 
list  at  the  same  address. 
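
The three codes defined above can be illustrated with a short sketch. It is not the program used in the study. It assumes each data year is held as a mapping from doctor name to practice address, and it compares a doctor's address with the most recent earlier year in which that doctor appeared; the function name and the sample listings are invented for illustration.

```python
from typing import Dict, List, Tuple


def code_new_doctors(listings: Dict[int, Dict[str, str]]) -> List[Tuple[int, str, int]]:
    """Return (data year, name, NEWDR code) for every doctor-observation.

    Code 1: first appearance; code 2: seen before at a different address;
    code 3: seen before at the same address.
    """
    last_address: Dict[str, str] = {}
    coded: List[Tuple[int, str, int]] = []
    for year in sorted(listings):
        for name, address in sorted(listings[year].items()):
            if name not in last_address:
                code = 1
            elif last_address[name] != address:
                code = 2
            else:
                code = 3
            coded.append((year, name, code))
            last_address[name] = address
    return coded


# Invented listings for three data years.
sample = {
    1912: {"Doctor A": "2500 West 25th Street"},
    1921: {"Doctor A": "2500 West 25th Street", "Doctor B": "Lorain Avenue"},
    1931: {"Doctor A": "Detroit Avenue", "Doctor B": "Lorain Avenue"},
}
codes = code_new_doctors(sample)
```

Under this rule every doctor in the first data year receives code 1, which agrees with the 1912 row of Table III-2.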

Statement of Presumptions

An assumption, recorded in Table III-1, has been made
that  the  location  of  doctors'  practices  is  influenced  by  the 
background  and  training  of  the  individual  doctor  interacting 
with  the  characteristics  of  the  neighborhoods  in  which  they 
choose  to  practice.  However,  the  circumstances  of  the  pres- 
ent study  have  made  it  necessary  to  concentrate  on  the  back- 
ground and  characteristics  of  the  physicians  for  the  time 
being.  It  is  planned  that  later  work  will  enable  the  cur- 
rent research  to  be  correlated  with  an  understanding  of  the 
communities  within  which  the  physicians  practice. 

For  the  present  study,  then,  the  following  initial  pre- 
sumptions were  made.  These  presumptions  were  modified  as 
the  results  of  each  step  of  the  research  program  were  inter- 
preted. 

Presumption I:

LOC = f(YEAR, BDATE, LICDATE, STATE, SPEC)

where:  LOC is location of physician as initially
        measured by distance to Cleveland CBD,

        YEAR is year of data,

        BDATE is year of physician's birth,

        LICDATE is date physician was licensed,

        STATE is location of medical school from
        which physician graduated,

        SPEC is type of physician's specialty.

Presumption II:

The relationship between LOC and the independent
variables in the above equation has changed over time.

The above two presumptions may be considered as the
"working hypotheses" for this study as derived from the
data and as qualified by time and other resources available.


CHAPTER IV

ANALYSIS OF DATA I

LOCATION DESCRIBED BY UNIVARIATE MEASURES


Since  one  of  the  primary  purposes  of  this  research  is 
to  evaluate  certain  analytical  techniques  that  would  enable 
the  historical  researcher  to  best  examine  his  or  her  data 
in  order  to  learn  as  much  as  possible,  it  was  considered 
desirable  that  the  first  step  in  the  research  design  should 
look  at  the  data  in  the  most  general  and  wide-ranging  way. 

As Tukey noted (1969, 83), the best first "look" is generally
visual observation where possible. Because of the
large  total  number  of  observations  (2006)  and  the  sizable 
number  of  variables  involved,  it  is  difficult  to  examine  all 
aspects  of  the  data  visually.  Nevertheless,  certain  plots 
of  the  data  are  useful  in  indicating  some  trends  in  physi- 
cian distribution. 

For instance, Figures 2 through 8* plot the location of
all  doctors  in  the  study  area  for  each  of  the  data  years 
1912  through  1969.  There  has  been  a steady  dispersal  of 
doctors  outward  from  the  inner  city  during  this  period.  As 
expected,  the  postwar  years  show  a particularly  large 


*For  base  map  of  the  area  see  Chapter  I,  Figure  1. 
increase in the number of doctors located in outer city and
suburban  areas.  The  year  1969  appears  to  show  a large 
drop-off  in  the  number  of  medical  practitioners  in  the  inner 
city  and  a large  growth  in  the  numbers  along  Lorain  Avenue 
and  in  the  nearby  shopping  areas. 

Figures  9 through  14  show  similar  plots  but  in  these 
only  doctors  who  appear  in  the  study  area  for  the  first  time 
in a given year are indicated. Again there seems to be a
marked movement outward in the location of new practitioners,*
and again there seems to be a noticeable jump outward in
1969. However, contrary to what might be expected, new
practitioners do not appear to be markedly more concentrated
in the outer areas than do all practitioners for any given
year.

In  order  to  look  closer  at  this  somewhat  surprising  re- 
sult, plots  were  also  made  of  each  of  the  other  two  classes 
of  doctors  based  on  the  NEWDR  variable.  These  two  classes 
are Class 2, doctors who were in the data in the previous
year but at a different location, and Class 3, doctors who
remained at the same location as in the previous data year.
Both these classes were mapped for all years, but only the
two maps for 1969 are shown in order to conserve space
(Figures 15 and 16). An examination of Figures 14 through


*Note that "new practitioner" merely indicates a practitioner
new to the study area. He or she may have been in
practice for some length of time in another area.
16 indicates a slight tendency for "old doctors" (doctors
who have remained at their old place of practice) to be
more centrally located than "new doctors" and "movers."

This  suggests  that  the  steady  dispersion  of  doctors 
outward  on  the  west  side  of  the  Cleveland  SMSA  is  not  caused 
by  a simple  process  of  new  doctors  being  added  on  the  out- 
skirts as  old  doctors  cease  practicing  for  whatever  reason 
in  the  inner  city  area.  The  process  appears  to  be  rather 
more  complex  than  that.  To  form  a clearer  picture  of  the 
possible  process,  maps  were  made  showing  the  location  of  all 
practitioners  in  each  data  year  who  either  moved  to  a new 
location  by  the  following  data  year  or  who  left  the  research 
area  altogether.  These  data  were  derived  from  the  LEAVER 
variable which classifies doctors into three classes: Class
1, doctors who do not appear in the list in the following
data  year;  Class  2,  doctors  who  moved  to  new  locations  within 
the  research  area  in  the  following  data  year;  and  Class  3, 
doctors  who  remained  at  the  same  location  in  the  following 
data  year. 

Figures 17 through 19 map the data for LEAVER Classes 1,
2, and 3 doctors in 1960. The year 1960 must be used as an
illustration instead of 1969 because 1969 is the last data
year and therefore has no LEAVER data. It appears that Class 1
doctors are somewhat more concentrated in the inner city than
Classes 2 and 3 doctors, although the distributions for each
class are fairly widespread.

Instead of the simple expansion model earlier rejected,
new  doctors  apparently  locate  in  all  parts  of  the  metropoli- 
tan area  but  have  tended  to  expand  their  distribution  out- 
ward during  the  study  period.  On  the  other  hand,  doctors 
who  ceased  practicing  have  also  been  widely  distributed 
throughout  the  study  area  with  a slightly  larger  concentra- 
tion in  the  inner  city.  These  two  slight  tendencies,  the 
tendency  for  the  distribution  of  new  doctors  to  expand  out- 
ward slowly  over  time  and  the  tendency  for  practitioners  who 
disappear  from  the  data  for  whatever  reason  to  be  somewhat 
more  numerous  in  the  inner  city,  have  produced  a long-term 
lessening of the proportion of doctors practicing in the inner
city with consequent increases in the number in the outer
areas.

These  maps  indicate  the  movements  of  new  and  old  doc- 
tors, but  they  give  little  information  concerning  the  vari- 
ables that  may  be  influencing  these  movements.  Each  of  the 
maps has been produced from only three variables, X-coordinate,
Y-coordinate, and class of doctors. This, of course,
neglects  the  other  data  on  physicians  that  have  been  collected. 

It  is  possible  to  produce  similar  maps  for  each  of  the 
other  variables,  but  this  would  result  in  another  case  of  the 
ongoing problem facing geographers: that of looking for
correspondence  in  the  patterns  among  numerous  sets  of  maps.* 


*As  an  example  of  the  possibilities,  six  maps  showing 
specialty  levels  and  location  have  been  included  for  the 
1960  data. 

LOCATION OF ALL DOCTORS IN STUDY AREA IN 1912
NOTE: ON THIS AND EACH OF THE SUCCEEDING MAPS ONE DOT EQUALS ONE DOCTOR

LOCATION OF ALL DOCTORS IN STUDY AREA IN 1921

LOCATION OF ALL DOCTORS IN STUDY AREA IN 1931

LOCATION OF ALL DOCTORS IN STUDY AREA IN 1969

LOCATION OF NEW DOCTORS TO STUDY AREA IN 1921

LOCATION OF NEW DOCTORS TO STUDY AREA IN 1940

LOCATION OF NEW DOCTORS TO STUDY AREA IN 1969

LOCATION OF DOCTORS WHO CHANGED LOCATION WITHIN STUDY AREA BETWEEN 1960 AND 1969

LOCATION OF DOCTORS WHO REMAINED AT THE SAME LOCATION IN 1960 AND 1969

1960 LOCATION OF DOCTORS WHO DO NOT APPEAR IN THE STUDY AREA IN 1969

FIGURE 18
1960 LOCATION OF DOCTORS WHO MOVE TO ANOTHER LOCATION WITHIN THE STUDY AREA

LOCATION OF DOCTORS WHO REMAINED AT THE SAME LOCATION IN BOTH 1960 AND 1969

1960 LOCATION OF SPECIALTY LEVEL 1 DOCTORS

1960 LOCATION OF SPECIALTY LEVEL 2 DOCTORS

1960 LOCATION OF SPECIALTY LEVEL 3 DOCTORS

FIGURE 23
1960 LOCATION OF DOCTORS WHO WERE TRAINED IN OHIO MEDICAL SCHOOLS

1960 LOCATION OF DOCTORS WHO WERE TRAINED IN U. S. MEDICAL SCHOOLS OTHER THAN OHIO

FIGURE 25
1960 LOCATION OF DOCTORS WHO WERE TRAINED IN FOREIGN MEDICAL SCHOOLS

Rather, the large number of observations necessitates the
use  of  summary  statistical  techniques. 

Automatic  Interaction  Detection 

There  are  numerous  ways  to  examine  a large  amount  of 
data  involving  several  variables  for  possible  relationships. 
Visual  examination  of  the  data  through  plots  and  similar 
figures has already been discussed. Another "sifting" technique
was developed at the Survey Research Center of the
University of Michigan in the 1960s and 1970s (Sonquist,
1970). This technique goes under the rather cumbersome title
of Automatic Interaction Detection (AID).

The technique, as indicated by its title, is especially
appropriate for examining relationships among a number of
predictors and a criterion variable when there may be interaction
as well as interrelation between the predictor variables.
Sonquist discusses fully both the purposes and problems
of AID (Sonquist, 1970; Sonquist et al., 1971). P. E.
Green (1978, 190-201) also describes AID in a more general
context. Sonquist and Green both stress that AID should be
used as a preliminary procedure in a research program, and
they both argue that it is most appropriate to large data
sets; that is, those with sample sizes of 600 to 800 or
larger. Green, however, is less concerned with sample
size when the results are replicated or cross-validated with
other analyses (P. E. Green, 1978, 200).

Automatic Interaction Detection became a part of the
large package of statistical programs, OSIRIS, developed at
the University of Michigan. It has since been translated
into a number of other packages suitable for various types
of computer installations. The specific form of AID that
was used here is AID-10, which was developed at the University
of Pittsburgh for use on PDP-10 computer installations
(AID-10, 1976).

All these varieties of AID use essentially similar procedures.
They require a single continuous criterion variable
and one or more categorical predictor variables. An
original predictor variable may be nominal, ordinal, or interval,
but it will be converted into a categorical variable
by the AID routine. The number of variables and the number
of categories in each which are allowable vary from installation
to installation, but typical systems allow twenty to
thirty predictor variables with ten to fifteen categories
per variable.

The AID process is a binary splitting process that
splits the original group into two classes according to a
single predictor. For each categorical predictor variable
the AID program calculates

$$SSA = m_1 \bar{Y}_1^2 + m_2 \bar{Y}_2^2 - m \bar{Y}^2$$

where SSA is the between-subgroups sum of squares,
$m$ = total number of observations,
$m_1$ = number of observations in first group,
$m_2$ = number of observations in second group,
$\bar{Y}$ = mean of criterion variable, total group,
$\bar{Y}_1$ = mean of criterion variable, first subgroup,
$\bar{Y}_2$ = mean of criterion variable, second subgroup.
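
The calculation can be illustrated with a short Python sketch. The function name `between_ss` and the sample distances are hypothetical and serve only as an illustration; the code evaluates the between-subgroups sum of squares defined above and is not taken from the AID-10 program itself.

```python
from statistics import fmean


def between_ss(group1, group2):
    """Between-subgroups sum of squares (SSA) for one binary split."""
    m1, m2 = len(group1), len(group2)
    y1, y2 = fmean(group1), fmean(group2)
    y = fmean(list(group1) + list(group2))  # mean of the undivided group
    return m1 * y1 ** 2 + m2 * y2 ** 2 - (m1 + m2) * y ** 2


# Hypothetical distances from the CBD (miles) for two candidate subgroups.
near = [2.0, 3.0, 4.0]
far = [6.0, 8.0]
ssa = between_ss(near, far)
print(round(ssa, 2))
```

A larger SSA means the two subgroup means lie farther apart relative to the overall mean, which is why the split with the maximum SSA is preferred.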

AID  calculates  the  SSA  for  each  possible  binary  split 
for  each  of  the  predictor  variables  and  then  chooses  the 
split  that  gives  the  maximum  SSA.  If,  for  instance,  a vari- 
able has  four  categories,  the  following  splits  are  possible 
if  order  in  categories  is  to  be  maintained: 

Class I          Class II

1                2, 3, 4
1, 2             3, 4
1, 2, 3          4

In general, if there are G categories in a variable and
order is to be maintained, there will be G-1 possible splits.
Thus, if we have five predictor variables each with ten categories,
AID will examine five-times-nine, or forty-five, possible
splits to determine the split that maximizes SSA, or
the between-subgroups sum of squares.
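
The count of order-preserving splits can be checked with a few lines of Python. This is a minimal sketch; the helper name `ordered_splits` is an assumption introduced here for illustration.

```python
def ordered_splits(categories):
    """All binary splits that keep the given category order intact."""
    return [(categories[:i], categories[i:]) for i in range(1, len(categories))]


# Four ordered categories give G - 1 = 3 splits.
splits = ordered_splits([1, 2, 3, 4])
for class_1, class_2 in splits:
    print(class_1, class_2)

# Five predictors with ten categories each: 5 * (10 - 1) candidate splits.
n_examined = 5 * len(ordered_splits(list(range(1, 11))))
print(n_examined)
```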

Most  AID  programs  also  allow  for  categories  to  be  split 
in  any  order.  This,  of  course,  greatly  increases  the  number 
of  possible  splits  to  be  examined.  The  above  example  would 
produce  seven  possible  splits: 

Class I          Class II

1                2, 3, 4
2                1, 3, 4
3                1, 2, 4
4                1, 2, 3

1, 2             3, 4
1, 3             2, 4
1, 4             2, 3

Any  other  splits  obviously  would  be  mirror  images  of 
the  above.  At  first  glance  it  appears  that  allowing  groups 
to  be  split  in  any  order  will  greatly  increase  the  computer 
costs.  However,  the  groups  are  first  ordered  on  the  basis 
of  the  means  of  the  criterion  variables  for  each  group  and 
the  splits  are  made  on  this  ordering.  Because  of  this  or- 
dering on  the  basis  of  the  criterion-variable  mean,  the 
splitting  process  may,  in  fact,  be  easier  than  when  the  nat- 
ural class  order  must  be  maintained. 
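
The following Python sketch illustrates both points on a toy example: it enumerates the seven free-order splits of four categories and then shows that, for these data, scanning only the G-1 splits of the mean-ordered categories finds the same best split. The data values and the helper names (`ssa`, `unordered_splits`, `pool`) are hypothetical, and the sketch is not the AID-10 implementation.

```python
from itertools import combinations
from statistics import fmean


def ssa(left, right):
    """Between-subgroups sum of squares for one binary split."""
    m1, m2 = len(left), len(right)
    y = fmean(left + right)
    return m1 * fmean(left) ** 2 + m2 * fmean(right) ** 2 - (m1 + m2) * y ** 2


def unordered_splits(cats):
    """All two-way partitions; mirror images are excluded by keeping
    the first category in Class I."""
    first, rest = cats[0], cats[1:]
    out = []
    for k in range(len(rest)):  # k other categories join the first one
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(c for c in rest if c not in extra)
            out.append((left, right))
    return out


# Hypothetical criterion values (miles) for each of four categories.
data = {1: [3.0, 3.4], 2: [6.1, 5.9], 3: [2.8, 3.1], 4: [6.5, 7.0]}


def pool(cats):
    """Criterion values of all observations in the given categories."""
    return [v for c in cats for v in data[c]]


all_splits = unordered_splits(tuple(data))
best_any = max(all_splits, key=lambda s: ssa(pool(s[0]), pool(s[1])))

# Order categories by criterion mean, then scan only the G - 1 ordered splits.
by_mean = sorted(data, key=lambda c: fmean(data[c]))
scan = [(tuple(by_mean[:i]), tuple(by_mean[i:])) for i in range(1, len(by_mean))]
best_scan = max(scan, key=lambda s: ssa(pool(s[0]), pool(s[1])))

print(len(all_splits), len(scan))
print(sorted(best_any[0]), sorted(best_scan[0]))
```

In this example the exhaustive search examines seven splits and the mean-ordered scan only three, yet both select the same two classes.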

Once  AID  has  split  the  original  group  into  two  classes, 
it  examines  each  class  for  further  possible  splitting.  It 
first  examines  the  class  with  the  largest  "total"  sum  of 
squares  and  proceeds  to  examine  again  all  possible  splits  on 
each  predictor  variable.  The  splitting  process  continues 
until  one  of  the  various  stopping  criteria  is  reached  at 
each  branch  of  the  splitting  process.  The  stopping  criteria 
are  minimum  group  size,  minimum  sum  of  squares  within  a 
group, and minimum between-subgroups sum of squares.
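
The splitting loop described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the AID-10 program: only two stopping rules are implemented (minimum group size and a minimum ratio of SSA to the total sum of squares of the original group), categories are scanned in criterion-mean order, and all names and data values are hypothetical.

```python
from statistics import fmean


def tss(ys):
    """Total sum of squares of a list of criterion values."""
    ybar = fmean(ys)
    return sum((y - ybar) ** 2 for y in ys)


def best_split(rows, predictors, min_size):
    """Return (SSA, predictor, categories in Class I) or None."""
    total = tss([y for _, y in rows])
    best = None
    for p in predictors:
        groups = {}
        for x, y in rows:
            groups.setdefault(x[p], []).append(y)
        order = sorted(groups, key=lambda c: fmean(groups[c]))
        for i in range(1, len(order)):
            left = [y for c in order[:i] for y in groups[c]]
            right = [y for c in order[i:] for y in groups[c]]
            if min(len(left), len(right)) < min_size:
                continue  # stopping rule: minimum group size
            ssa = total - tss(left) - tss(right)  # between-subgroups SS
            if best is None or ssa > best[0]:
                best = (ssa, p, set(order[:i]))
    return best


def aid(rows, predictors, min_size=2, min_ratio=0.05):
    """Split repeatedly, largest-TSS group first; return the final groups."""
    root_tss = tss([y for _, y in rows])
    if root_tss == 0:
        return [rows]
    leaves, pending = [], [rows]
    while pending:
        pending.sort(key=lambda g: tss([y for _, y in g]))
        group = pending.pop()  # group with the largest total sum of squares
        found = best_split(group, predictors, min_size)
        if found is None or found[0] / root_tss < min_ratio:
            leaves.append(group)  # stopping rule: SSA/TSS below cutoff
            continue
        _, p, left_cats = found
        pending.append([r for r in group if r[0][p] in left_cats])
        pending.append([r for r in group if r[0][p] not in left_cats])
    return leaves


# Hypothetical observations: (predictor levels, distance from CBD in miles).
rows = [
    ({"LICDATE": 6, "STATE": 1}, 4.0), ({"LICDATE": 6, "STATE": 2}, 4.2),
    ({"LICDATE": 7, "STATE": 1}, 3.9), ({"LICDATE": 7, "STATE": 2}, 4.1),
    ({"LICDATE": 8, "STATE": 1}, 6.0), ({"LICDATE": 8, "STATE": 2}, 6.3),
    ({"LICDATE": 9, "STATE": 1}, 6.1), ({"LICDATE": 9, "STATE": 2}, 6.4),
]
leaves = aid(rows, ["LICDATE", "STATE"])
for leaf in leaves:
    print(sorted({x["LICDATE"] for x, _ in leaf}), len(leaf))
```

With these toy data the first split falls on LICDATE and no further split passes the 5% cutoff, so the procedure ends with two groups.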

The  above  is  only  a capsule  description  of  the  capa- 
bilities and  operations  of  AID.  Any  prospective  user  should 
read the succinct summary in P. E. Green (1978). Sonquist's
works then provide detailed step-by-step instructions in the
use of AID (Sonquist, 1970; Sonquist et al., 1971).

Results  of  AID  Analysis 

A number  of  AID  analyses  were  performed  on  the  data  in 
order  to  ascertain  all  possible  relationships  therein.  Each 
of  the  runs  used  DISTCBD,  or  distance  of  practice  from 
Public Square, Cleveland, in a straight line, as the criterion
or dependent variable. There were at a maximum seven
predictor variables:

1. YEAR -- Date at which observation occurred.

2. STATE -- Location of doctor's medical school.

3. SPEC -- Level of doctor's specialty.

4. BDATE -- Date of doctor's birth.

5. LICDATE -- Date doctor was licensed to practice.

6. NEWDR -- Whether or not the doctor was new to data.

7. LEAVER -- Whether or not the doctor left the data
   in the succeeding data year.

Each variable was coded (or recoded if originally a continuous
variable) into levels. These levels are summarized
in Table IV-1.

The  first  AID  run  examined  all  2006  doctor-observa- 
tions for  all  data  years.  A minimum  group  size  of  twenty- 
five  was  specified  and  splitting  was  stopped  when  the  ratio 
of  SSA  to  Total  Sum  of  Squares  (TSS)  was  less  than  .05. 

TABLE IV-1

LEVELS OF PREDICTOR VARIABLES

Variable     Level    Original Data

YEAR           1      1912
               2      1921
               3      1931
               4      1940
               5      1950
               6      1960
               7      1969

STATE          1      Ohio
               2      Other U. S.
               3      Foreign

SPEC*          1      General Practice
               2      Office-Based Specialty
               3      Hospital-Oriented Specialty

BDATE          1      Before 1880
               2      1880 Through 1899
               3      1900 Through 1919
               4      1920 Through 1939
               5      1940 Through 1959

LICDATE        1      Before 1880
               2      1880 Through 1889
               3      1890 Through 1899
               4      1900 Through 1909
               5      1910 Through 1919
               6      1920 Through 1929
               7      1930 Through 1939
               8      1940 Through 1949
               9      1950 Through 1959
              10      After 1959

NEWDR          1      Doctor New to Data List
               2      Old Doctor at New Location
               3      Old Doctor at Previous Location

LEAVER         1      Leaves Data List Next Data Year
               2      Moves to New Location Next Year
               3      Stays at Same Location Next Year

*See Schultz (1975).


The results are shown in Figure 26, where:

n = size of group, and
Y = mean distance of group from CBD in miles.


LICDATE < 1940          LICDATE ≥ 1940

FIGURE 26

TOTAL POPULATION OF ALL DOCTORS, ALL YEARS

The AID program terminated here since no further splits
satisfied the requirement for the ratio of SSA to TSS. The results
indicate  that  there  are  quite  substantial  differences  be- 
tween doctors  licensed  in  1912  through  1931  and  doctors  li- 
censed between  1940  and  1969.  The  data  for  the  latter  per- 
iod indicate  a practice  location  almost  twice  as  far  from 
the  CBD  as  in  the  earlier  years.  This  is  not  unexpected 
given  the  earlier  plots  of  the  data;  however,  the  size  of 
the difference is somewhat greater than is apparent through
visual examination.

Although  no  further  splits  were  made  because  of  the 
specified  conditions,  the  next  split  of  both  new  groups  also 
would  have  been  made  on  license  year.  This  is  noteworthy 
because  of  the  importance  of  license  year  as  a determinant 
of  practice  location  in  the  following  analyses.  It  is  also 
indicative  of  the  power  of  AID  to  unravel  complex  problems. 

If  the  large  group,  or  those  licensed  before  1940,  had 
been  split  into  two  groups,  those  before  1920  would  have  a 
mean  DISTCBD  of  about  three  miles  and  those  licensed  in  the 
period  1921  to  1939  a mean  DISTCBD  of  about  four  miles.  In 
addition,  although  LICDATE  was  allowed  to  be  split  in  any 
order,  it,  nevertheless,  split  into  two  "natural"  chronolog- 
ical groups.  This  reinforces  the  observation  that  newer 
doctors  have  tended  to  locate  farther  and  farther  out  from 
the  central  city. 

In  order  to  provide  a clearer  picture  of  all  possible 
predictor  variables  operating  in  reference  to  DISTCBD,  Table 
IV-2  has  been  constructed  showing  the  mean  distance  to  the 
CBD  for  each  level  of  the  predictor  variables.  Within  the 
table  each  level  of  each  variable  is  ordered  by  mean  DISTCBD. 
The  table  was  derived  directly  from  AID  output. 

This  table  provides  an  overview  but  should  be  used  with 
considerable  caution  for  the  following  reasons: 

1. Many doctors appear in the data several times.

2. Without cross-classification the data may be somewhat
   misleading.

TABLE IV-2

MEAN DISTANCE TO CBD (PUBLIC SQUARE) FOR EACH LEVEL OF
PREDICTOR VARIABLES FOR ALL DOCTORS AND ALL YEARS

Variable     Level*      N      Mean DISTCBD

YEAR           1        158        2.70
               2        205        3.28
               3        264        3.75
               4        288        3.90
               5        330        4.34
               6        366        5.26
               7        395        6.29

STATE          1       1112        4.17
               3        193        4.56
               2        701        5.04

SPEC           1       1094        3.79
               2        351        5.30
               3        561        5.43

BDATE          5          2        2.75
               1        423        3.27
               2        585        3.78
               3        687        4.94
               4        309        6.67

LICDATE        1         38        2.75
               2         68        2.89
               4        242        3.11
               3        166        3.37
               5        288        3.80
               6        375        4.24
               7        326        4.62
              10         31        5.86
               8        312        6.14
               9        160        7.18

NEWDR          3        629        4.29
               1        952        4.60
               2        425        4.64

LEAVER         2        425        4.08
               3        629        4.29
               1        952        4.85

*See Table IV-1 for an explanation of levels.

For instance, it is shown that doctors who are appearing in
the  data  for  the  last  time  (LEAVER  Level  3)  have  a greater 
mean  distance  from  the  CBD  than  do  the  other  classes.  This 
seems  to  contradict  the  earlier  statement  that  LEAVER  doc- 
tors are  concentrated  in  the  inner  city.  However,  all  data 
years  have  been  included  in  the  table  to  provide  a check  on 
the  totals  and  LEAVER  Level  3 is  dominated  by  1969,  the  last 
data  year  and  consequently,  the  year  in  which  all  doctors  are 
LEAVER  Level  3. 

To  avoid  these  problems  several  AID  runs  were  made  in 
which  only  new  doctors  and/or  moved  doctors  were  analyzed. 
Some  of  the  results  are  illustrated  below. 


Split on LICDATE

LICDATE ≤ 1889 and          1890 ≤ LICDATE ≤ 1899
1900 ≤ LICDATE ≤ 1939       LICDATE ≥ 1940


FIGURE 27

AID RUN II: 1921-1969; NEW DOCTORS (NEWDR LEVEL 1)

                    n = 1219
                    Y = 4.86
                Split on LICDATE

      n = 800                       n = 419
      Y = 4.01                      Y = 6.49
      LICDATE ≤ 1939                LICDATE ≥ 1940

FIGURE 28

AID RUN III: NEW AND MOVED DOCTORS (NEWDR LEVELS 1 AND 2)
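
As a check on how the figures in such a tree relate to one another, the mean of the undivided group should equal the size-weighted average of the two subgroup means. Using the values printed in Figure 28 (the notation $\bar{Y}_1$, $\bar{Y}_2$ is borrowed from the SSA definition and is an assumption about how the figure's Y values map onto it):

```latex
\bar{Y} = \frac{m_1 \bar{Y}_1 + m_2 \bar{Y}_2}{m_1 + m_2}
        = \frac{800(4.01) + 419(6.49)}{1219}
        = \frac{5927.31}{1219}
        \approx 4.86
```

The result agrees with the mean of 4.86 miles reported for the full group of 1219 doctor-observations.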

Both  these  AID  runs  strongly  support  1940  as  a water- 
shed year  in  the  choice  of  new  practice  locations.  Those 
who  were  licensed  before  1940  tend  to  have  practice  loca- 
tions much  closer  to  the  CBD  than  those  licensed  afterwards. 
Table  IV-3  for  new  and  moved  doctors  emphasizes  this.  Note 
the  large  jump  between  levels  7 and  10. 

In  order  to  determine  differences  in  where  new  doctors 
located  over  the  data  years,  an  AID  run  was  made  for  each 
data  year  1921  through  1969.  The  year  1912  was  not  included 
since  all  doctors  were  new  doctors  that  year.  The  same  was 
done  for  new  doctors  and  movers  together  for  each  data  year 
1921  through  1969.  The  results  follow. 

TABLE IV-3

MEAN DISTCBD FOR ALL DOCTORS IN NEWDR CLASSES 1 AND 2

Variable     Level       N      Mean DISTCBD

LICDATE        1         14        2.66
               2         24        3.27
               4         94        3.41
               5        179        3.70
               3         57        3.92
               6        221        4.12
               7        211        4.62
              10         31        5.86
               8        243        6.16
               9        145        7.19


1921 New Doctors: Could not be split.


NEWDR = 2, 3                NEWDR = 1
(Formerly in Area)          (New in Area)

FIGURE 29

1921 NEW DOCTORS AND MOVERS TOGETHER

New doctors in 1921 were clearly more distant from the
CBD than old and moved doctors.

1931 New Doctors: Could not be split.

1931 New Doctors and Movers Together: Could not be split.

Neither attempt for 1931 produced a successful split.

1940 New Doctors: Could not be split.

1940 New Doctors and Movers Together: Could not be split.

Neither attempt for 1940 produced a successful split.


STATE = 3               STATE = 1, 2
(Foreign)               (All U. S.)

FIGURE 30

1950 NEW DOCTORS

This split produced a clear dichotomy between the
close-in location of new foreign-trained doctors and the
much more distant location of U. S.-trained doctors. However,
the number of new foreign doctors locating in the
study area was still small in 1950.

                    n = 201
                    Y = 4.56
                 Split on STATE

      n = 17                        n = 184
      Y = 2.28                      Y = 4.81
      STATE = 3                     STATE = 1, 2
      (Foreign)                     (All U. S.)

FIGURE 31

1950 NEW DOCTORS AND MOVERS TOGETHER

This result merely reinforces that found for new
doctors only in 1950.

LEAVER = 1              LEAVER = 2, 3
(Leave Area)            (Stay in Area)

FIGURE 32

1960 NEW DOCTORS

The  AID  runs  for  1960  produced  the  most  complex  results 
for  any  data  year.  The  first  split  is  clearly  between  ear- 
lier years  and  1940.  The  latter  group  is  composed  almost 
entirely  of  doctors  licensed  in  or  after  1940;  it  contains 
only  one  exception,  a single  doctor  licensed  before  1910. 
When  this  group  is  split  on  LEAVER,  there  is  a clear  dis- 
tinction between  doctors  who  disappear  from  the  data  by  1969 
(the next data year) and those who do not. The leavers are
clearly  more  concentrated  in  the  inner  parts  of  the  study 
area.  These  results  show  a not  unexpected  interaction  be- 
tween date  of  license  and  the  year  in  which  a doctor  disap- 
pears from  the  data. 


LEAVER = 1              LEAVER = 2, 3
(Leave Area)            (Stay in Area)

FIGURE 33

1960 NEW DOCTORS AND MOVERS TOGETHER: 5% SSA/TSS CUTOFF

This  reinforces  the  earlier  result  showing  the  tendency 
of  leavers  to  be  closer  in  toward  the  central  city  than 
others.  In  order  to  test  for  other  interactions,  1960  new 
doctor  and  mover  data  were  rerun  using  a 2.5%  instead  of  5% 
SSA/TSS ratio as a cutoff point. This produced rather
interesting results.

                       n = 243
                       Y = 5.77
                    Split on LEAVER

   n = 83                             n = 160
   Y = 4.57                           Y = 6.39
   LEAVER = 1                         LEAVER = 2, 3
   (Leave Area)                       (Stay in Area)
                                      Split on STATE

                     STATE = 1, 2                 STATE = 3
                     (All U. S.)                  (Foreign)
                     Split on LICDATE

           LICDATE ≤ 1949          n = 28
                                   Y = 7.99
                                   LICDATE ≥ 1950

FIGURE 34

1960 NEW DOCTORS AND MOVERS TOGETHER: 2.5% SSA/TSS CUTOFF

This AID run produced the most interesting tree of all
runs and indicates some rather complex interactions. Among
others it shows that LEAVER status, STATE of medical school,
and LICDATE (year of licensing) all interact to produce a
group of doctors with locations very distant from the CBD.
In other words, doctors who reappear in the data in 1969, who
went to medical school in the United States, and who were
licensed in or after 1950 have a practice location at a mean
distance from the CBD (7.99 miles) that is almost 60% greater
than all other doctors in 1960 (5.03 miles). On the
other hand, new doctors who reappear in 1969, who trained in
the U. S., but who were licensed before 1950 have a mean distance
only about 20% greater than other doctors in 1960.
These manipulations indicate how difficult it is to unravel
the causal relationships for any one period.


LICDATE = Other         1950 ≤ LICDATE ≤ 1959

FIGURE 35

1969 NEW DOCTORS

Once again license year appears as the most important
influence on location although proportional differences in
DISTCBD between the two groups appear to have lessened when
compared to that in 1960.


LICDATE = Other         1940 ≤ LICDATE ≤ 1959

FIGURE 36

1969 NEW DOCTORS AND MOVERS TOGETHER

This produced results similar to those above except
that the larger group takes in more years.

Summary  of  AID  Results 

The complex nature of the Automatic Interaction Detection
routine and the multiplicity of results make it difficult
to compile a verbal summary. Tabulation of the results
(Table IV-4) indicates some conclusions.

TABLE IV-4

SUMMARY OF AID RESULTS

                                        SSA/TSS    Predictor     Predictor     Predictor
             Groups                     Cutoff     Variable for  Variable for  Variable for
Dates        Included           n       Point      1st Split     2nd Split     3rd Split

1921         New Drs            96      5%         Not Split     -             -
1921         Moved & New Drs    156     5%         NEWDR         Not Split     -
1931         New Drs            138     5%         Not Split     -             -
1931         Moved & New Drs    211     5%         Not Split     -             -
1940         New Drs            103     5%         Not Split     -             -
1940         Moved & New Drs    168     5%         Not Split     -             -
1950         New Drs            130     5%         STATE         Not Split     -
1950         Moved & New Drs    201     5%         STATE         Not Split     -
1960         New Drs            182     5%         LICDATE       LEAVER        -
1960         Moved & New Drs    243     5%         LEAVER        Not Split     -
1960         Moved & New Drs    243     2.5%       LEAVER        STATE         LICDATE
1969         New Drs            146     5%         LICDATE       Not Split     -
1969         Moved & New Drs    241     5%         LICDATE       Not Split     -
1912-1969    All Drs            2006    5%         LICDATE       Not Split     -
1912-1969    New Drs            795     5%         LICDATE       Not Split     -
1912-1969    Moved & New Drs    1219    5%         LICDATE       Not Split     -

It appears that prior to about 1950 the predictor
variables used in this study accounted for very little of
the differentiation among doctors. This result probably
has several contributing factors. First, doctors
were relatively more concentrated before this time, thus
making it more difficult to separate them into groups based
on distance from the CBD. Moreover, the small sample sizes
prior to 1950 almost certainly contribute to AID's failure
to differentiate doctors. It also must be admitted that
doctors' locations before 1950 may have been differentiated
on variables other than the ones chosen here. Finally, it
is possible that there may have been little or no differentiation
among doctors by location. After 1950 location of
the doctors' medical schools apparently was the dominant
factor differentiating among doctors for a period but was
replaced in importance by license year in 1969.

Close  study  of  the  original  AID  runs  indicates  far 
more  than  is  contained  in  this  brief  summary.  Examination 
of  all  results  shows  that  LICDATE  was  important  all  through 
the time period of the research. In many cases where a
split was not completed because of the cutoff rules, LICDATE
would have been the splitting predictor variable if the next
split had been made.

On the basis of AID analysis the best interpretation is that
license year has played the dominant role in determining the
doctors' practice locations as shown by distance from the CBD.
Of secondary importance has been the location of doctors'
medical schools. However, the interaction of all factors
appears very complex, and much more analysis could be done
with AID in relation to other possible variables as well as
other areas of Cleveland and other research locations.

Regression  Analysis 

In order to test further the influences on doctors' locations,
measured by distance from the CBD, multiple regression
analysis was performed with DISTCBD as the Y- or dependent
variable and with the same predictor variables as in the AID
analysis. The simple correlation matrix is shown in Table IV-5.
Regressions for individual years did not produce especially
meaningful results because all had multiple Rs below .40
except 1960, which reached .42. This indicates that no
equation predicted more than (.42)², or about 18%, of the
variation in the dependent variable.

A similar analysis was performed for all doctors in all years
and a multiple R of .45 was obtained, giving an R² of .20.
LICDATE accounts for 19% of the total variation and no other
variable except SPEC is statistically significant in an
inferential sense. The results are summarized in Table IV-6.
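
The step from a multiple R to the share of variation explained
can be illustrated with a minimal sketch. It assumes NumPy is
available; the function name `multiple_r` and the synthetic
data are hypothetical and are not the study's data or program.
It fits an ordinary least-squares equation with an intercept
and shows that the squared correlation between observed and
fitted values equals R².

```python
import numpy as np

def multiple_r(X, y):
    """Return (multiple R, R squared) for an OLS fit of y on X with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    # Multiple R is the correlation between observed and fitted values.
    r = np.corrcoef(y, fitted)[0, 1]
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return r, 1.0 - ss_res / ss_tot

# Synthetic stand-ins for two predictors and a distance measure (illustrative only).
rng = np.random.default_rng(0)
n = 500
licdate = rng.normal(size=n)
spec = rng.normal(size=n)
distcbd = 0.45 * licdate + 0.10 * spec + rng.normal(size=n)

r, r2 = multiple_r(np.column_stack([licdate, spec]), distcbd)

# The reported multiple Rs converted to shares of variation.
share_1960 = 0.42 ** 2       # about 0.18
share_all_years = 0.45 ** 2  # about 0.20
```

Squaring the reported values in this way gives the 18% and 20%
figures quoted in the text.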

Although some of the multiple regression results are
interesting in a suggestive rather than a statistical sense,
care must be taken not to make too much of them. It appears
that date of license and specialty level are both positively
related to distance from the CBD and that the relationship is
statistically significant. In other words, the later the
license (the newer the doctor), the farther he or she is from
the CBD; and the higher the level of his or her specialty, the
farther he or she is from the CBD.


TABLE IV-5

CORRELATION MATRIX, ALL DOCTORS, ALL YEARS


TABLE IV-6

RESULTS OF MULTIPLE REGRESSION ANALYSIS, ALL DOCTORS, ALL YEARS


Dependent    Independent
Variable     Variable       Beta       F       r²     Simple R

DISTCBD      LICDATE         .44     28.7     .19       .43
             SPEC            .12     27.7     .20       .28
             STATE          -.06      6.5     .20       .11
             LEAVER         -.04      4.2     .20      -.10
             BDATE          -.05      0.3     .20       .42
             NEWDR           .00      0.0     .20      -.05

STATE, LEAVER, and BDATE all have slightly negative Beta
values, which suggests that when the other variables are held
constant:

1.  The farther from Ohio the doctor's medical school,
    the closer his practice location will be to the CBD.

2.  Leavers are closer in to the CBD than those who
    remain.

3.  The younger the doctor, the closer he is to the CBD.

The last finding, in particular, is interesting since it
somewhat contradicts earlier indications that newer doctors
locate farther from the CBD than older ones. It appears that
if LICDATE (and the other variables) are held constant, there
is a very slight tendency for younger doctors to be closer in.
But once again, caution should be taken against forming major
conclusions on the basis of these statistically nonsignificant
results.

It should be recalled that location in the AID and regression
analyses has been measured as a univariate variable, distance
from the CBD. This is a limited and perhaps misleading way of
measuring location in as complex a system as a large city.
Therefore, the following analyses attempt to give location a
more complex multivariate measure.


CHAPTER  V 

ANALYSIS  OF  DATA  2 

LOCATION  AS  A MULTIVARIATE  DEPENDENT  MEASURE 
CANONICAL  CORRELATION  ANALYSIS 

Each of the earlier methods of analysis has treated location
as univariate; i.e., as the distance from the CBD. Distance
itself is derived from two variables, the X-coordinate and the
Y-coordinate of the doctor's practice. With distance alone it
is not possible, however, to examine the influence of each
coordinate separately.

Several statistical techniques have been developed during the
past twenty years that enable the researcher to treat the
dependent variable as having multiple dimensions. These are
described in a number of new "handbooks" to multivariate data
analysis. Among the best are those by Harris (1975),
Gnanadesikan (1977), and P. E. Green (1978). Described
techniques include canonical correlation and multivariate
analysis of variance and covariance (MANOVA and MANCOVA).

Another method of handling dependent variables with multiple
dimensions is reduction of the number of dimensions through
one of the methods Green refers to as reduced space analysis
(P. E. Green, 1978, 341). Among these techniques are factor
analysis, multidimensional scaling, and cluster analysis. It
is probably possible to apply all these techniques to the
present problem, but time, money, and space limitations led to
a decision to apply three techniques that held the most
initial promise. These were canonical correlation, factor
analysis, and cluster analysis. Of these three only two will
be discussed in detail since factor analysis of the dependent
variables added little to what was previously known.

The least rewarding technique, factor analysis, including
principal components, is well known to geographers and has
been widely used over the last twenty or thirty years. In the
present study principal component analysis without rotation
was applied to two different sets of dependent variables. The
first set included only the X- and Y-coordinates (XCOOR and
YCOOR) of the doctors' locations; the second set included
these two variables plus a third, distance from the CBD
(DISTCBD). This analysis altered the first presumption (see
page 38) so that now it looks like either of the two following
statements:

Presumption I(a):

(XCOOR, YCOOR) = f(YEAR, BDATE, LICDATE, STATE, SPEC)

or

Presumption I(b):

(XCOOR, YCOOR, DISTCBD) = f(YEAR, BDATE, LICDATE, STATE, SPEC).

Rotation was not performed because rotation to simple
structure with such small numbers of variables merely produces
factors that are virtually identical to the original variables.

The X- and Y-coordinates of all doctors for all years have a
correlation of only .23 (see Table IV-5); in other words,
there is a slight tendency for the Y-coordinate to increase as
the X-coordinate increases. A quick glance at the map confirms
this. It is possible to derive a factor on which each variable
loads significantly and which contains 62% of the total
variance of the data.
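
For two standardized variables the first principal component
can be worked out directly from their correlation r: the
eigenvalues of the 2 x 2 correlation matrix are 1 + r and
1 - r, and each variable loads sqrt((1 + r) / 2) on the first
component. The sketch below is only an arithmetic
illustration; the function name is hypothetical and the input
is the rounded correlation of .23 quoted above.

```python
import math

def two_variable_pca(r):
    """First principal component of two standardized variables with correlation r >= 0."""
    eigenvalues = (1.0 + r, 1.0 - r)
    # Share of total variance (2 units for two standardized variables).
    share = eigenvalues[0] / 2.0
    # Loading of each variable on the first component.
    loading = math.sqrt(eigenvalues[0] / 2.0)
    return eigenvalues, share, loading

eigenvalues, share, loading = two_variable_pca(0.23)
print(round(eigenvalues[0], 2), round(share * 100, 1), round(loading, 2))
```

With r = .23 this gives an eigenvalue of 1.23, about 61.5% of
the variance, and loadings of about .78, close to the 62%
reported; any small difference from the published figures
presumably reflects rounding of the correlation.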

Moreover, when DISTCBD was added to the dependent variable set
and principal component analysis was applied to all three
variables, XCOOR, YCOOR, and DISTCBD, a first factor was
derived which accounts for 72% of the total variance. The
results of these two factors are summarized in the following
table; it should be remembered that Factors A and B were
derived in separate principal component analyses.


TABLE V-1

SUMMARY OF PRINCIPAL COMPONENT ANALYSIS:
LOADINGS OF EACH VARIABLE ON RESPECTIVE FACTORS


                             Factor A         Factor B

XCOOR                          .79              .94
YCOOR                          .79              .56
DISTCBD                    Not Included        -.99
Eigenvalue                    1.24             2.17
% of Variance Explained       61.9%            72.2%

Factor scores were calculated from each of the above principal
component results and these were used as dependent variables
in multiple regressions similar to that above. As expected
from the structure of the factors, the results of the
regressions were virtually identical to the earlier multiple
regression, with the exception that the multiple regression on
factor scores derived only from XCOOR and YCOOR was slightly
poorer than that on DISTCBD alone, while the multiple
regression on factor scores derived from XCOOR, YCOOR, and
DISTCBD was slightly better than that on DISTCBD alone. This
suggests that DISTCBD is the most important component in a
doctor's location, but that knowing the X- and Y-coordinates
contributes slightly to understanding the reasons for his or
her location choice.

It was decided to capitalize on this knowledge by performing
canonical correlation (CANCORR) analysis on the data. CANCORR
is not unknown in geographical research (see, for instance,
Monmonier, 1972) but its use has been much less prevalent than
factor analysis.

In canonical correlation the purpose is to find "a linear
composite of the y-variables and a (different) linear
composite of the x-variables so that when this pair of derived
variables (i.e., pair of linear composites) is correlated, the
resulting two-variable correlation is the highest attainable"
(P. E. Green, 1978, 260-261). CANCORR may be thought of as
doing principal component analysis on the sets of variables on
each side of the equation, the predictor and the criterion, in
such a way that the principal components derived for each side
are maximally correlated. The presumptions for CANCORR are
effectively the same as for factor analysis.

When this is completed, it is then possible to derive a second
pair of linear composites, normally uncorrelated with the
first pair, and chosen so that their correlation is maximal.
If there are p predictor variables and c criterion variables
(p + c = m), then there will be either p or c possible pairs
depending on which is the smaller. The correlations between
successive pairs will become continually smaller (or at least
no larger) until some cutoff point is reached before the
limiting number of pairs.
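
The extraction of successive pairs of composites described
above can be sketched as follows. This is a minimal
illustration, assuming NumPy and full-rank data matrices; it
uses a QR/SVD route to the canonical correlations rather than
whatever statistical package produced the results reported
here, and the function name and synthetic data are
hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between column sets X and Y (rows are observations)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx'Qy are the canonical correlations, largest first.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example: p = 4 predictors, c = 3 criteria, one built-in relationship.
rng = np.random.default_rng(1)
n = 400
predictors = rng.normal(size=(n, 4))
criteria = np.column_stack([
    1.5 * predictors[:, 0] + rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
rho = canonical_correlations(predictors, criteria)
```

The number of correlations returned is the smaller of p and c
(three here), and they are ordered from largest to smallest,
as stated in the text.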

CANCORR was performed on the data for all doctors in all years
and the results are recorded in Table V-2. Analysis of the
results reveals that three canonical variates (CANVARs), the
maximum possible given three criterion variables and four
predictor variables, were extracted from the data. Although
all three canonical variates were statistically significant
when tested with Wilks' Λ and χ², only the first accounted for
a substantial amount of the variance.

The first CANVAR accounted for 21% of the variance in the data
(an eigenvalue of .21, which is the canonical correlation
squared). The table of canonical coefficients shows that XCOOR
and DISTCBD among the criterion variables correlate moderately
highly with this variate and that LICDATE among the predictor
variables is very highly correlated with it. Both YCOOR and
SPEC are correlated with CANVAR 1 moderately although probably
significantly. Interestingly, both STATE and BDATE apparently
are unrelated to CANVAR 1, despite the fact that BDATE and
LICDATE have a simple correlation of +.97.


TABLE V-2

CANCORR RESULTS, ALL DOCTORS, ALL YEARS


Canonical
Variate                  Canonical     Wilks'
Number     Eigenvalue   Correlation      Λ        χ²      d.f.   Significance

   1          .21           .45         .77     515.2      12       0.000
   2          .02           .15         .97      56.1       6       0.000
   3          .01           .08         .99      11.9       2       0.003


CANONICAL COEFFICIENTS, FIRST SET

             CANVAR 1    CANVAR 2                   CANVAR 3

XCOOR          0.45        2.02                      -7.07
YCOOR          0.22        1.52                      -1.29
DISTCBD       -0.45        2.74 [sign illegible]     -7.45


CANONICAL COEFFICIENTS, SECOND SET

             CANVAR 1    CANVAR 2    CANVAR 3

SPEC          -0.23        0.88        0.62
STATE          0.09       -0.50        0.75
BDATE          0.04       -1.24        1.83
LICDATE       -0.95        0.80       -2.37

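
The Wilks' Λ and degrees-of-freedom columns of Table V-2 can
be checked from the eigenvalues alone, as the following sketch
suggests. It assumes the usual sequential test, in which Λ for
the k-th variate is the product of (1 - eigenvalue) over the
remaining variates and the degrees of freedom are
(p - k + 1)(c - k + 1). The chi-square function uses
Bartlett's approximation; both function names are
hypothetical, and because the table's eigenvalues are rounded
the published χ² values are not expected to be reproduced
exactly.

```python
import math

def wilks_sequence(eigenvalues, n_predictors, n_criteria):
    """Wilks' lambda and degrees of freedom for each successive canonical variate."""
    results = []
    for k in range(len(eigenvalues)):
        # Product over this variate and all later ones.
        lam = math.prod(1.0 - e for e in eigenvalues[k:])
        df = (n_predictors - k) * (n_criteria - k)
        results.append((lam, df))
    return results

def bartlett_chi_square(lam, n_obs, n_predictors, n_criteria):
    """Bartlett's approximate chi-square statistic for a given Wilks' lambda."""
    factor = n_obs - 1 - (n_predictors + n_criteria + 1) / 2.0
    return -factor * math.log(lam)

# Rounded eigenvalues from Table V-2; four predictors, three criteria.
table = wilks_sequence([0.21, 0.02, 0.01], n_predictors=4, n_criteria=3)
for lam, df in table:
    print(round(lam, 2), df)
```

The rounded results (.77, .97, .99 with 12, 6, and 2 degrees of
freedom) agree with the corresponding columns of the table.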
On CANVAR 2 BDATE has a high loading as do all the criterion
variables. However, CANVAR 2 accounts for only just over 2% of
the variability in the data. CANVAR 3 accounts for much less
than 1% of the total variance.

These results suggest some provocative although tentative
conclusions. The first is that there is a moderate but
important relationship between LICDATE on one hand and XCOOR
and DISTCBD on the other, and that it is the dominant
underlying factor in the data. A secondary, but still
significant, relationship exists between all three criterion
variables and BDATE. A third result, although rather damaging
to our thesis of possible causal relationships, is that each
of the two canonical variates appears to be very much
dominated by one side of the relationship, either criterion or
predictor. CANVAR 2 is very much dominated by the criterion
variables whereas CANVAR 1 is dominated, albeit to a lesser
degree, by the predictor variables.

Despite the above caveat, it appears that LICDATE clearly has
been the most important variable determining the location of
doctors on the west side of the Cleveland SMSA during the
study period and that SPEC has been of secondary significance.
BDATE appears to have an independent influence on location
separate from LICDATE although these two predictor variables
are, themselves, highly correlated.

Although the canonical correlation technique has provided some
interesting and highly suggestive results, and is worthy of
further investigation,* it possibly suffers from treating
location as a continuous linear although multivariate measure.
Examination of the plots for doctors' locations suggested that
the distributions are not continuous and also not bivariate
normal. On the basis of this intuitive assessment and the
suggestive but not overwhelming results obtained from AID,
factor, and CANCORR analyses, it was decided to try techniques
which may handle data of this type. Cluster analysis was the
obvious choice.


*CANCORR was applied to the data for some of the individual
years, but most of the results were not significant.


CHAPTER  VI 
ANALYSIS  OF  DATA  3 

LOCATION  AS  A NONCONTINUOUS  OR  NOMINAL  MEASURE 


Cluster  Analysis  with  Discriminant  Analysis 

Cluster analysis is a procedure which classifies or groups
data observations on the basis of their scores on a number of
variables. It is a type of reduced space analysis similar in
some respects to factor analysis; however, in cluster analysis
one is interested in reducing the number of observations or
rows of the data matrix to a smaller number by grouping the
observations into clusters. The aim is to produce clusters of
observations such that observations within the same cluster
are more like each other than they are like observations in
any other cluster.

Cluster analysis is a relatively old technique (Sokal and
Sneath, 1963; Tryon and Bailey, 1970) and has undergone
considerable development over the last few years in both
methodology (Anderberg, 1973; Everitt, 1974; Hartigan, 1975)
and in the computer programs that perform the clustering
routines (Wishart, 1975). Geographers have applied cluster
analysis to a number of problems and the similarity of it to
other spatial analytic techniques virtually assures that its
use within the discipline will grow.

As in all procedures where a multiplicity of techniques is
available, the user of cluster analysis is faced with the task
of selecting the most appropriate one for his particular
problem. The process of selection remains an individual
decision; Sherman and Sheth (1977, 194) conclude, "Human
judgment is the single most important factor in the generation
of meaningful clustering results."

The  problem  here  was  relatively  simple;  the  goal  was  to 
produce  or  find  "meaningful"  clusters  of  doctors  as  plotted 
in  two-dimensional  space,  with  the  two  dimensions  measured 
in  equal  units  and  the  dimensions  treated  as  being  of  equal 
importance.  Nevertheless,  it  was  still  possible  to  apply 
any  number  of  clustering  techniques. 

Ward's algorithm was the first choice because it has been
applied in other situations with good results by geographers
(Spurlock, 1978) and is widely available. A full description
of Ward's algorithm may be found in Ward (1963) or in one of
the reviews of cluster analysis (Frank and Green, 1968;
Anderberg, 1973; Bijnen, 1973; Everitt, 1974).
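
The idea behind Ward's algorithm can be conveyed by a short
sketch: starting with every point as its own cluster, the pair
of clusters whose merger least increases the total
within-cluster sum of squares is joined, and this is repeated
until the desired number of clusters remains. The code below
is a naive pure-Python illustration for points in two
dimensions; the function name and the sample coordinates are
hypothetical and it is not the program used in this study.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering of (x, y) points with Ward's minimum-variance criterion."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], (float(x), float(y))) for i, (x, y) in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                dist2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        merged = (ma + mb,
                  ((na * ca[0] + nb * cb[0]) / (na + nb),
                   (na * ca[1] + nb * cb[1]) / (na + nb)))
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(members) for members, _ in clusters)

# Seven hypothetical practice locations forming three obvious groups.
points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.2, 4.9), (9, 0), (9.1, 0.2)]
groups = ward_clusters(points, 3)
```

Because both coordinates enter the squared distance with equal
weight, the sketch treats the two dimensions as equally
important, in keeping with the problem as stated above.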

Cluster analysis may be a useful tool for clustering or
classifying doctors by location but it does not answer the
question of why they are in those clusters. Cluster analysis
may be thought of as a technique which transforms a group of
continuous variables into a single nominal variable with as
little loss of information as possible. What was needed in the
present study, then, was a technique which relates a single
nominal criterion variable to a number of predictor variables
of varying types. Discriminant analysis was the preferred
method.

Discriminant analysis, or multiple discriminant analysis (MDA)
as it is named when the criterion variable has more than two
classes, is a technique that attempts to find the linear
combination of predictor variables which has the highest
possible correlation (canonical) with the nominal criterion
variable. More specifically, MDA has the following specialized
objectives (P. E. Green, 1978, 297):

1.  To find linear composites (canonical variates) of
    the predictor variables which maximize the
    among-group to within-group variability of the
    criterion variable.

2.  To test for "true," i.e., statistically significant,
    differences in the group centroids.

3.  To find which predictor contributes most to
    discriminating among the groups.

MDA is also frequently used to classify new observations to
groups on the basis of the observations' predictor-variable
scores. Or alternatively, assignment to groups of the original
observations is done on the basis of predictor-variable scores
and these results are matched against the actual group to
which the observation belongs. This constitutes one test of
the MDA procedure.

The procedure used herein has three parts:

1.  Cluster analysis of the new doctors' locations for
    each data year (except 1912). Ward's algorithm
    was used (although certain other algorithms were
    experimented with) and an arbitrary number of
    groups (5) was produced for each year. (Again
    other numbers of groups were experimented with and
    some of the results are reported below.)

2.  Use of MDA to find the best linear combination of
    the predictor variables that will account for the
    groupings of the doctors.

3.  Use of this MDA-derived linear combination to
    assign each doctor to a group on the basis of his
    predictor variables. These assignments were then
    checked against his actual location or cluster to
    test the accuracy of the MDA results.
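
Steps 2 and 3 can be sketched in a few lines. The sketch
assumes NumPy, integer cluster labels, and a nonsingular
within-group scatter matrix; it finds the discriminant
functions from the between-group and within-group scatter and
then assigns each observation to the nearest group centroid in
discriminant space. The nearest-centroid rule, the function
names, and the synthetic data are assumptions made for
illustration and are not necessarily what the package used in
this study did.

```python
import numpy as np

def discriminant_functions(X, groups):
    """Discriminant weights and canonical correlations for predictors X and group labels."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))  # pooled within-group scatter
    B = np.zeros((p, p))  # between-group scatter
    for g in np.unique(groups):
        Xg = X[groups == g]
        mean_g = Xg.mean(axis=0)
        W += (Xg - mean_g).T @ (Xg - mean_g)
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Whiten by W so that B v = lambda W v becomes a symmetric eigenproblem.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    eigvals, eigvecs = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    weights = L_inv.T @ eigvecs[:, order]
    canonical_r = np.sqrt(eigvals / (1.0 + eigvals))
    return weights, canonical_r

def hit_rate(X, groups, weights):
    """Share of observations whose nearest centroid in discriminant space is their own."""
    groups = np.asarray(groups)
    scores = np.asarray(X, dtype=float) @ weights
    labels = np.unique(groups)
    centroids = np.array([scores[groups == g].mean(axis=0) for g in labels])
    dists = ((scores[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predicted = labels[np.argmin(dists, axis=1)]
    return float(np.mean(predicted == groups))

# Synthetic data: three groups of 60 with different means on two predictors.
rng = np.random.default_rng(2)
means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
labels = np.repeat(np.arange(3), 60)
X = means[labels] + rng.normal(size=(180, 2))
weights, canonical_r = discriminant_functions(X, labels)
rate = hit_rate(X, labels, weights)
```

The proportion returned by `hit_rate` corresponds to the
"Prediction Results" line reported in the tables that follow.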

The  results  of  the  above  analyses  do  not  lend  themselves 
easily  to  summary  form  (as  expected  of  so  complex  a problem) 
but  are  most  significant  in  their  implication.  They  are 
summarized  in  chronological  order  on  the  following  pages. 
TABLE VI-1

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1921


Cluster Analysis (See Figure 37)

Cluster    Size    Circle Radius    Verbal Description

   1        31         2.18         Outer City
   2        27         1.58         Lakewood
   3        29         1.38         Inner City
   4         7         0.73         Metropolitan Hospital
   5         2         3.92         Far Suburban

 Total      96


Standardized Discriminant Function Coefficients

               FUNC 1    FUNC 2    FUNC 3    FUNC 4

LICDATE         -.15       .49       .65      1.36
BDATE           -.81      -.55      -.31     -1.17
SPEC            -.21      -.59       .30       .22
STATE            .02       .45       .70      -.59
LEAVER           .69      -.51       .26      -.27

Canonical
Correlation      .29       .23       .10       .02


Centroids of Clusters in Reduced Space

Cluster        FUNC 1    FUNC 2    FUNC 3    FUNC 4

   1            -.25       .20       .05      -.00
   2             .02      -.19       .08       .00
   3             .11      -.18      -.09       .00
   4             .09       .43      -.23       .00
   5            1.68       .59       .24      -.01

Prediction Results: 27.1% Correctly Classified

1921 New Doctors

The cluster analysis produced three large clusters and two
outlying smaller ones. The three large classes are labelled
Inner City, Outer City, and Inner Suburban; the other two are
labelled Metropolitan Hospital and Outer Suburban.

The MDA results suggest that there are two discriminant
functions that are fairly important in accounting for the
particular cluster into which a doctor falls, a third that is
moderately useful, and a fourth that is of negligible
importance. The first discriminant function (FUNC 1) has as
its most important components BDATE and LEAVER. The second
discriminant function (FUNC 2) has all five predictor
variables as approximately equal components. The third
function (FUNC 3) has LICDATE and STATE as its most important
parts. FUNC 1 is most useful for separating the doctors in
Cluster 5 from other groups, particularly Cluster 1 (see
Centroids section of Table VI-1). On the other hand, FUNC 2 is
good at separating doctors in Clusters 4 and 5 from those in
Clusters 2 and 3. Finally, FUNC 3 separates Cluster 5 from
Cluster 4 with the other clusters falling in between.

The prediction results indicate that the relationship produced
from the MDA is successful at predicting only about 27%, or a
little over one-fourth, of the doctors' actual clusters. This
may seem a poor result but it should be noted that a random
assignment of doctors to five groups would on average locate
only 20% of them in their right groups. Therefore, our MDA
procedure is approximately one-third (7/20) more accurate than
uninformed guessing.
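
The comparison with uninformed guessing can be set out as a
small calculation. The equal-chance baseline (one in five) is
the one used in the text. The proportional-chance baseline, in
which guesses are made in proportion to cluster size, is not
part of the original analysis and is included here only as an
assumed alternative for comparison; the function name is
hypothetical.

```python
def chance_baselines(cluster_sizes):
    """Expected hit rates for two uninformed ways of assigning doctors to clusters."""
    total = sum(cluster_sizes)
    # Every cluster guessed with equal probability (the baseline used in the text).
    equal_chance = 1.0 / len(cluster_sizes)
    # Every cluster guessed in proportion to its size (an assumed alternative).
    proportional_chance = sum((n / total) ** 2 for n in cluster_sizes)
    return equal_chance, proportional_chance

sizes_1921 = [31, 27, 29, 7, 2]   # cluster sizes from Table VI-1
observed = 0.271                  # reported share correctly classified
equal, proportional = chance_baselines(sizes_1921)
gain = (observed - equal) / equal
print(round(equal, 3), round(proportional, 3), round(gain, 3))
```

Against the 20% baseline the relative gain is about .355, the
"approximately one-third" of the text; with the 1921 cluster
sizes the proportional baseline works out to about 28%, so the
choice of baseline affects how the 27% figure is read.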

By  way  of  summary  it  can  be  stated  that  date  of  birth 
and  whether  or  not  doctors  leave  the  area  separate  the  Outer 
Suburban  doctors  from  the  rest.  On  the  other  hand,  all  the 
predictor  variables  working  together  seem  to  separate  both 
Outer  Suburban  and  Metropolitan  Hospital  doctors  from  the 
others.  And  finally,  license  date  and  origin  of  the  doctors 
separate  Outer  Suburban  doctors  from  Metropolitan  Hospital 
doctors  with  the  rest  falling  in  between  on  this  function. 

In all the above analyses, SPEC, STATE, and LEAVER have been
treated as continuous variables although, in fact, they are
essentially nominal variables. Given the importance that these
three have in the MDA results, it was decided to break each of
these up into dummy variables and rerun the analyses. For
instance, SPEC has been recoded as SPEC 1, SPEC 2, or SPEC 3
according to the specialist level of the doctor. If the doctor
were a second-level specialist, SPEC 1 would have a value of
0, SPEC 2 would be 1, and SPEC 3 would be 0. In order to avoid
problems of singularity in the matrix only the first two
recoded variables were entered in the analysis. Hence, only
SPEC 1 and SPEC 2 entered into the calculations.
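
The recoding just described can be sketched as follows. The
function name and the sample values are hypothetical; the
sketch simply turns a three-level nominal variable into two
0/1 columns and omits the third, since a doctor with 0 on both
retained columns must belong to the omitted level.

```python
def dummy_code(values, levels):
    """Recode a nominal variable into 0/1 columns, omitting the last level."""
    kept = levels[:-1]  # the omitted level is implied when all kept columns are 0
    return [[1 if v == level else 0 for level in kept] for v in values]

# Specialist levels of four hypothetical doctors.
spec = [1, 2, 3, 2]
coded = dummy_code(spec, levels=[1, 2, 3])
# Each row is (SPEC 1, SPEC 2); SPEC 3 is not entered.
for level, row in zip(spec, coded):
    print(level, row)
```

A second-level specialist is coded (0, 1) and a third-level
specialist (0, 0). Entering all three columns would make them
sum to 1 for every doctor, which is the singularity the text
refers to.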

The  results  in  Table  VI-2  are  for  1921  with  the  recoded 
variables.  The  MDA  and  prediction  steps  only  are  shown  since 
the  cluster  analysis  is,  of  course,  unaffected. 
TABLE VI-2

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1921
WITH DUMMY VARIABLES FOR SPEC, STATE, AND LEAVER FROM TABLE VI-1


Standardized Discriminant Function Coefficients

               FUNC 1    FUNC 2    FUNC 3    FUNC 4

LICDATE         -.64       .76       .10       .41
BDATE            .56       .11       .26      -.16
SPEC 1          -.94      -.44       .02      -.20
SPEC 2          -.73      -.54       .39       .62
LEAVER 1        -.53       .74      -.80       .31
LEAVER 2        -.22       .09     -1.17       .84
STATE 1*        -.04       .30      -.05       .31

Canonical
Correlation      .32       .27       .14       .09


Centroids of Clusters in Reduced Space

Cluster        FUNC 1    FUNC 2    FUNC 3    FUNC 4

   1            -.16       .73       .02    [illegible]
   2            -.00      -.11       .09      -.08
   3            -.18      -.12       .02       .13
   4            -.33      -.11      -.47      -.08
   5             .90       .05      -.11       .05

Prediction Results: 42.7% Correctly Classified


*There was no STATE 3; therefore, STATE 2 had to be dropped to
prevent singularity.

1921 New Doctors with Dummy Variables for the Classes of
SPEC, STATE, and LEAVER

The above results are apparently much better than those in the
previous attempt at MDA since 43% were classified correctly in
the prediction step. However, this is probably less important
than the changes in the discriminant function structure.
Concentration on the loadings with an absolute value of more
than .70 shows that FUNC 1 is primarily related to the two
specialty codes. Since FUNC 1 is again most useful at
separating Outer Suburban doctors from the others, it appears
that specialist class is the determining characteristic of
Outer Suburban doctors. Care must be taken in interpreting the
results because the influence of the dummy variables not
present in the analysis must be inferred from the two related
dummy variables. For instance, since both SPEC 1 and SPEC 2
have a high loading on FUNC 1, by inference SPEC 3 has a very
low loading on FUNC 1. On the other hand, LEAVER 1 and LEAVER 2
both have moderate loadings on FUNC 1 so LEAVER 3 may also
have a moderate loading. From this method of interpretation we
can infer that STATE 3 may also have a very high loading on
FUNC 1.

A continuation of the above interpretation for the other
functions suggests that FUNC 2 is primarily related to
LICDATE, LEAVER 1, and possibly STATE 2 and is best at
separating Cluster 1 doctors from the others. In other words,
Outer City doctors seem to be separated from the others by the
year in which they were licensed, whether they leave the data
in the next data year, and whether or not they were originally
from out-of-state. FUNC 3 is related very strongly to the
permanence measure and is best at differentiating Metropolitan
Hospital-centered doctors from the rest. FUNC 4 is related to
LEAVER 2 and SPEC 2 but does not strongly differentiate any
group of doctors from the others.

The above interpretation conveys something of the richness
(and difficulty) of MDA as an analytic technique. The results
for the remaining years will not be discussed in detail but
are reported fully in tabular form.

Further analysis was done without the LEAVER variable in order
to produce a set of predictor variables that reflects the
doctor's situation at the time he or she took up practice at a
given location. It may be argued that since we were here
looking at only doctors new to the area, the LEAVER variable
is at least a crude measure of the doctor's intent to remain
in the area and, as such, is also a measure of the doctor's
pre-location set of conditions. However, there are many
reasons for moving or leaving the area (such as death) that
are not necessarily pre-planned by the new doctor. Therefore,
further analysis dropped the LEAVER variable although mobility
tendencies are certainly worthy of further research.

In the interest of producing a coherent set of analyses for
each of the data years, the table below summarizes the MDA for
1921 without the LEAVER variables.
TABLE VI-3

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1921
WITH LEAVER VARIABLES DROPPED


Cluster Analysis: No Change from Previous Tables


Standardized Discriminant Function Coefficients

               FUNC 1    FUNC 2    FUNC 3    FUNC 4

LICDATE         -.24       .32       .62       .65
BDATE           1.01       .31      -.59      -.46
SPEC 1          -.49       .66      -.33      -.77
SPEC 2           .01       .16       .52     -1.04
STATE 1          .22      -.69      -.44      -.06

Canonical
Correlation      .26       .16       .11       .04


Centroids of Clusters in Reduced Space

Cluster        FUNC 1    FUNC 2    FUNC 3    FUNC 4

   1             .04       .22       .00       .02
   2             .14      -.08       .12      -.03
   3             .03      -.17       .02    [illegible]
   4            -.41       .09      -.19      -.09
   5           -1.54      -.15       .30       .06

Prediction Results: 35.4% Correctly Classified

1921 New Doctors without LEAVER Variables

The prediction level has fallen somewhat from that in the
previous results although it is still significantly better
than uninformed guessing. The makeup of the discriminant
functions has changed substantially, notably in the first
function where BDATE is now by far the most dominant variable.
This indicates that BDATE is significantly related to a
doctor's likelihood of locating in the outer suburbs or at
Metropolitan Hospital. The other discriminant functions seem a
rather mixed bag and their role in discriminating among
clusters is reduced from the previous analysis.

1931  Results  of  Cluster  and  Discriminant  Analyses 

Table  VI-4  indicates  that  the  analyses  for  1931  are 
much  more  striking  than  those  for  1921.  The  cluster  anal- 
ysis for  1931  produces  clusters  very  similar  to  those  for 
1921  although  most  have  grown  slightly  in  numbers  of  doctors. 

The MDA results are significant because all of the first
three functions appear to be relatively important in
discriminating among the clusters of doctors. FUNC 1 includes
in particular LICDATE, STATE 1, STATE 2, and probably SPEC 3.
This function is especially useful for separating Clusters 1
and 4 from the others. Clusters 1 and 4 together may be
called the Inner City Cluster; therefore, it appears that
Inner City-1931 new doctors are distinguished by having been
licensed recently, a smaller than usual likelihood of being
from the U.S., and probably by an increased likelihood of
being a third-level specialist.

TABLE VI-4

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1931

Cluster Analysis (See Figure 38)

Cluster   Size   Circle Radius   Verbal Description
1           61       1.84        Inner City
2           27       1.87        Outer City
3           30       1.83        Lakewood
4           13       1.24        Metropolitan Hospital
5            7       2.70        Outer Suburban
Total      138

Standardized Discriminant Function Coefficients

                          FUNC 1  FUNC 2  FUNC 3  FUNC 4
LICDATE                     -.75    1.17    1.02    -.71
BDATE                       -.07    -.93    -.93     .38
SPEC 1                      -.39    -.73     .48     .36
SPEC 2                      -.20     .11    -.15     .59
STATE 1                     1.01    -.12    1.00    -.67
STATE 2                      .96     .27    1.30     .10
Canonical Correlation        .46     .35     .32     .08

Centroids of Clusters in Reduced Space

Cluster                   FUNC 1  FUNC 2  FUNC 3  FUNC 4
1                           -.33    -.07    -.25     .02
2                            .11    -.13     .50     .09
3                            .43     .55    -.04    -.03
4                           -.46    -.19     .43    -.20
5                           1.41    -.94    -.36    -.06

Prediction Results: 51.4% Correctly Classified

FUNC 2, particularly, appears to be important as separating
Lakewood, or Inner Suburban doctors, from Outer Suburban
doctors. On the basis of the coefficients and centroids it
appears that new doctors in Lakewood have newer licenses but
are older, whereas new doctors in the Outer Suburbs are
younger with older licenses.

FUNC 3, which is somewhat similar in makeup to FUNC 2
with the addition of STATE 1 and STATE 2, appears to
differentiate what might be called the middle city area from
the others. Apparently, new doctors with new licenses and
who are from the U.S. are more likely to locate in this
zone.
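
One way to gauge the relative importance of the four 1931
functions is to convert each canonical correlation in Table VI-4
into the eigenvalue it implies, using the standard identity
eigenvalue = r^2 / (1 - r^2), and to express each eigenvalue as a
share of the total. The sketch below is an illustrative check and
not part of the original analysis; the function name is
hypothetical.

```python
def discriminating_shares(canonical_correlations):
    """Return each function's share of total discriminating power.

    Uses the standard identity eigenvalue = r**2 / (1 - r**2), where r
    is the canonical correlation of a discriminant function.
    """
    eigenvalues = [r ** 2 / (1.0 - r ** 2) for r in canonical_correlations]
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Canonical correlations for FUNC 1 to FUNC 4, copied from Table VI-4 (1931).
shares_1931 = discriminating_shares([0.46, 0.35, 0.32, 0.08])
for number, share in enumerate(shares_1931, start=1):
    print(f"FUNC {number}: {share:.1%} of total discriminating power")
```

On these figures the first three functions carry roughly a half,
a quarter, and a fifth of the total, while the fourth carries very
little, which agrees with the reading given above.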

1940  Results  of  Cluster  and  Discriminant  Analyses 

The results for 1940 are not as clear-cut as those for
1931. The clusters are in similar locations although the
Inner City cluster is much smaller and the Outer City cluster
appears to be growing somewhat at the expense of the Inner
City cluster.

TABLE VI-5

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1940

Cluster Analysis (See Figure 39)

Cluster   Size   Circle Radius   Verbal Description
1           22       2.02        Inner City
2           30       1.34        Lakewood
3           26       1.69        Metropolitan Hospital
4           21       1.95        Outer City
5            4       5.17        Outer Suburban
Total      103

Standardized Discriminant Function Coefficients

                          FUNC 1  FUNC 2  FUNC 3  FUNC 4
LICDATE                     1.27    -.55     .51    1.06
BDATE                       -.75     .63    -.70   -1.50
SPEC 1                      -.57     .40     .60     .28
SPEC 2                      -.60    -.25     .26    -.20
STATE 1                     -.26     .92   -1.99    1.37
STATE 2                     -.05    1.46   -1.64    1.07
Canonical Correlation        .40     .35     .25     .13

Centroids of Clusters in Reduced Space

Cluster                   FUNC 1  FUNC 2  FUNC 3  FUNC 4
1                           -.67     .32     .05    -.03
2                            .06    -.12    -.33     .09
3                            .00    -.49     .20    -.08
4                            .47     .34     .25     .09
5                            .72     .54    -.39    -.52

Prediction Results: 38.8% Correctly Classified

FUNC 1 from the MDA results is clearly dominated by
LICDATE and probably by STATE 3. Since FUNC 1 is clearly
most useful for separating the Inner City cluster from the
others, particularly the Outer City and Outer Suburban, it
appears that the new doctors with older licenses are now
tending to concentrate in the inner city. Since these
doctors are also more likely to be foreign-trained, it may
well be that older licenses and foreign training are part of
the same package.

FUNC 2 is dominated by U.S.-trained doctor variables
and is a good discriminator of Metropolitan Hospital-centered
doctors from the other groups. It appears that already by
1940 foreign-trained doctors were beginning to play an
important role in providing specialist manpower to the
huge Metropolitan Hospital and surrounding area.
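
The centroids in Table VI-5 can be used to illustrate how a
doctor's discriminant scores might translate into a predicted
cluster. The sketch below assigns a score vector to the nearest
centroid by Euclidean distance in the reduced space. This is an
assumed, simplified rule for illustration only; the classification
procedure actually used by the MDA program is not described here,
and the sample score vector is hypothetical.

```python
import math

# Cluster centroids on FUNC 1 to FUNC 4, copied from Table VI-5 (1940).
CENTROIDS_1940 = {
    1: (-0.67, 0.32, 0.05, -0.03),   # Inner City
    2: (0.06, -0.12, -0.33, 0.09),   # Lakewood
    3: (0.00, -0.49, 0.20, -0.08),   # Metropolitan Hospital
    4: (0.47, 0.34, 0.25, 0.09),     # Outer City
    5: (0.72, 0.54, -0.39, -0.52),   # Outer Suburban
}


def nearest_cluster(scores, centroids=CENTROIDS_1940):
    """Return the cluster whose centroid is closest to the score vector."""
    return min(centroids, key=lambda c: math.dist(scores, centroids[c]))


# Hypothetical doctor with a strongly negative FUNC 1 score.
example_scores = (-0.60, 0.30, 0.00, 0.00)
print(nearest_cluster(example_scores))
```

A doctor scoring low on FUNC 1 (older license) falls nearest the
Inner City centroid under this rule, in line with the
interpretation of FUNC 1 given above.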

1940 Results of Cluster and Discriminant Analyses: New
Doctors with Only Two Groups

A number  of  other  analyses  were  performed  on  1940  data 
with  various  groupings  of  original  clusters.  Limitations  of 
space  and  reader  interest  militate  against  displaying  all 
results  here.  However,  the  results  obtained  when  the  clus- 
ters were  combined  into  just  two  groups  are  meaningful  and 
are  reproduced  below. 

Although the prediction results here appear high, it
should be remembered that with two classes, random guessing
should average 50% correct classification. Nevertheless,
the MDA results indicate clearly that in 1940 Inner City
new doctors tended to have a combination of older licenses
and overseas training.

TABLE VI-6

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1940
TWO GROUPS ONLY

Cluster Analysis

Group   Verbal Description
1       Inner City Group: consists of original 1940 Clusters 1
        and 2
2       Outer City-Suburban Group: consists of all other 1940
        Clusters

Standardized Discriminant Function Coefficients

                          FUNC 1
LICDATE                   [illegible in source]
BDATE                       -.41
SPEC 1                      -.53
SPEC 2                      -.72
STATE 1                     1.01
STATE 2                     1.21
Canonical Correlation        .34

Centroids of Groups in Reduced Space

Group                     FUNC 1
1                           -.36
2                            .31

Prediction Results: 65.0% Correctly Classified

1950 Results of Cluster and Discriminant Analyses

TABLE VI-7

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1950

Cluster Analysis (See Figure 40)

Cluster   Size   Circle Radius   Verbal Description
1           41       1.81        Inner City and Metropolitan
                                 Hospital
2           47       1.96        Lakewood
3            4       0.00        Bay Village
4           24       2.43        Middle Western Cleveland
5           14       2.27        Southwestern Outer Suburbs
Total      130

Standardized Discriminant Function Coefficients

                          FUNC 1  FUNC 2  FUNC 3  FUNC 4
LICDATE                      .19    -.56   -3.00    -.67
BDATE                       -.05    -.09    2.61    1.39
SPEC 1                      -.36    -.52     .06    -.55
SPEC 2                       .32    -.08    -.16     .21
STATE 1                      .87    -.52     .86   -1.07
STATE 2                     1.21    -.17     .60   -1.26
Canonical Correlation        .42     .32     .19     .06

Centroids of Clusters in Reduced Space

Cluster                   FUNC 1  FUNC 2  FUNC 3  FUNC 4
1                           -.42     .31    -.01    -.03
2                            .44     .07     .14     .02
3                           -.38   -1.14     .41    -.23
4                           -.37    -.37    -.06     .09
5                            .48    -.16    -.44    -.06

Prediction Results: 47.7% Correctly Classified

Between 1940 and 1950 substantial shifts occur in the
clusters. The Inner City and Metropolitan Hospital clusters
have amalgamated into one large cluster centered primarily
on the hospital. The Outer City cluster continues to move
toward the Southwest and the Lakewood cluster has grown
considerably in number of doctors. The Outer Suburban cluster
has split into two parts with one small cluster at Bay Village
and a larger one in the southwest suburban area around
Fairview Park and North Olmsted. Several of the clusters appear
to be growing in size, reflecting dispersion of doctors.

FUNC 1 from the MDA is dominated by the doctor's origin
variables and is the best discriminator of the Lakewood-Fairview
Park new doctors from the other clusters. New doctors
in these clusters are more likely to be from the U.S. than
those in other clusters.

The other functions produce mixed results of varying
importance although it is interesting to note that FUNC 3,
dominated by LICDATE and BDATE, is a good discriminator
between Southwestern Outer Suburbs and Bay Village. It is
similar in makeup to the function for 1940 which discriminated
between Lakewood and other suburban doctors. However,
in both cases at least one of the discriminated clusters was
quite small so caution must be taken in drawing conclusions.

1960 Results of Cluster and Discriminant Analyses

TABLE VI-8

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1960

Cluster Analysis (See Figure 41)

Cluster   Size   Circle Radius   Verbal Description
1           68       1.24        Lakewood
2           36       2.04        Inner City and Metropolitan
                                 Hospital
3           37       2.70        Southwestern Cleveland
                                 and Fairview Park
4           32       1.70        Middle Western Cleveland
5            9       3.94        Far Western Suburbs
Total      182

Standardized Discriminant Function Coefficients

                          FUNC 1  FUNC 2  FUNC 3  FUNC 4
LICDATE                     -.33    -.12   -1.25    2.34
BDATE                        .57     .92     .63   -2.28
SPEC 1                      -.71     .66     .27    -.22
SPEC 2                       .06    -.12    -.15    -.09
STATE 1                      .43     .53    1.21     .15
STATE 2                      .15     .46    1.13     .85
Canonical Correlation        .43     .27     .19     .07

Centroids of Clusters in Reduced Space

Cluster                   FUNC 1  FUNC 2  FUNC 3  FUNC 4
1                            .34    -.11     .16    -.02
2                           -.62    -.11     .12     .08
3                            .38     .05    -.27     .06
4                           -.44    -.04    -.21    -.10
5                           -.11    1.17     .18    -.01

Prediction Results: 42.3% Correctly Classified

The cluster analysis of the 1960 new doctor data produced
clusters similar in location to those of 1950. The
continuing decline of the Metropolitan Hospital-Inner City
group is evident. On the other hand, the number of doctors
in the Lakewood cluster has increased substantially as have
all of the clusters in the western city and suburbs. The
Fairview Park and far southwestern part of Cleveland has
become a major center for doctors' practices. The increase
of doctors in far western North Olmsted has combined with
those in Bay Village to produce a small but growing Far
Western group.

The MDA results indicate that the zone of practices
extending from Lakewood and Rocky River through Fairview Park
and far southwestern Cleveland is strongly differentiated
from the other areas on the basis of FUNC 1. FUNC 1 has as
its major components SPEC 1 and probably STATE 3. The
coefficients and centroids indicate that this area is
characterized by doctors who are more likely to be third-level
specialists and less likely to be from overseas than in other
areas.

FUNC 2 is very useful for differentiating the Far Western
cluster from the others and its composition indicates that
doctors in this area are characterized by their relative youth.

FUNC 3 separates what might be called the Southwestern
Cleveland group of clusters from the others and indicates
that new doctors in this area are characterized by being
newly licensed and having attended medical school somewhere
in the United States. FUNC 4 has little importance as a
discriminator.

1969 Results of Cluster and Discriminant Analyses

TABLE VI-9

CLUSTER AND DISCRIMINANT ANALYSES RESULTS FOR NEW DOCTORS: 1969

Cluster Analysis (See Figure 42)

Cluster   Size   Circle Radius   Verbal Description
1           34       2.34        Inner City and Metropolitan
                                 Hospital
2           32       1.69        Rocky River Area
3           40       1.75        Lakewood and West Central
                                 Cleveland
4           17       3.52        Far Western Suburbs
5           23       1.54        Southwestern Cleveland
Total      146

Standardized Discriminant Function Coefficients

                          FUNC 1  FUNC 2  FUNC 3  FUNC 4
LICDATE                     1.22     .03    1.69   -1.57
BDATE                       -.70     .14   -1.74    2.10
SPEC 1                      -.50    -.28    -.36     .04
SPEC 2                      -.52     .77    -.21    -.16
STATE 1                     -.51     .03     .69     .63
STATE 2                     -.36     .29     .69    -.03
Canonical Correlation        .38     .25     .21     .11

Centroids of Clusters in Reduced Space

Cluster                   FUNC 1  FUNC 2  FUNC 3  FUNC 4
1                            .18     .04    -.15    -.19
2                            .20     .37     .19     .05
3                           -.59    -.08     .07    -.00
4                            .51    -.50     .27     .02
5                            .10    -.05    -.37     .20

Prediction Results: 34.2% Correctly Classified

The attempt to produce five clusters for the 1969 new
doctors appears somewhat less satisfactory visually than the
earlier cluster analyses. There seems to be a well-defined,
if looser, Far Western Suburban cluster but the Lakewood-
Western City-Inner Suburban area is very complex. The
attempt to define clusters in this area appears rather
arbitrary. This situation probably reflects the increased
importance of this area in 1969 as a location for new doctors.
Two-thirds of all new doctors in the study area are in this
group of clusters and it appears that the Lakewood to
Fairview Park area, which might be termed the Rocky River
corridor, has become the "medical center" for the western part
of Cuyahoga County and Cleveland in 1969--at least so far as
new doctors are concerned.

At least partially as a result of the increasing
complications in the cluster analysis, the MDA for 1969 also
appears to be rather less satisfactory than for the earlier
years. Certainly the prediction results are unspectacular.
Notwithstanding these problems, there are some significant
results in the MDA. FUNC 1 clearly differentiates the
Lakewood cluster from the others and shows that new doctors in
Lakewood are distinguished by their older licenses.

FUNC 2 is fairly good at separating doctors in the Far
Western Suburbs from the others, particularly those in the
Rocky River area. It appears that Far Western-Suburban
doctors are seldom second-level specialists but are somewhat
more frequently from overseas than doctors in other areas.

Both FUNC 3 and FUNC 4 are dominated by the inverse
relationship between LICDATE and BDATE. Earlier it was noted
that BDATE and LICDATE tend to have, as expected, a high
positive correlation among all doctors. However, there
appear to be some areas in which this relationship is changed.
Presumably, it is changed at least partially because of the
presence of many foreign-trained doctors. In addition it is
likely that it is affected by specialist-trained doctors who
may be licensed at an older age.
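
A function that loads positively on LICDATE and negatively on
BDATE, with coefficients of similar size, behaves roughly like a
score for age at licensing. The sketch below makes that concrete
under a simplifying assumption that both variables are
standardized with the same spread (here 10 years). The doctors,
means, and spread are hypothetical; only the two coefficients come
from FUNC 3 in Table VI-9.

```python
# Standardized coefficients on FUNC 3 for 1969, from Table VI-9.
COEF_LICDATE = 1.69
COEF_BDATE = -1.74

# Hypothetical standardization constants (years); not from the study.
MEAN_LICDATE, MEAN_BDATE, SPREAD = 1960.0, 1933.0, 10.0


def func3_contrast(bdate, licdate):
    """Partial FUNC 3 score using only the LICDATE and BDATE terms."""
    z_lic = (licdate - MEAN_LICDATE) / SPREAD
    z_bd = (bdate - MEAN_BDATE) / SPREAD
    return COEF_LICDATE * z_lic + COEF_BDATE * z_bd


# Two hypothetical doctors licensed in the same year at different ages.
licensed_at_26 = func3_contrast(bdate=1938, licdate=1964)
licensed_at_36 = func3_contrast(bdate=1928, licdate=1964)
print(licensed_at_26, licensed_at_36)
```

The doctor licensed at the later age receives the higher partial
score, so a cluster with a high centroid on such a function is one
whose new doctors tend to be licensed later in life.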

Whatever the reason--and more research should be done
in this area--there are certain clusters that appear to be
distinguished from the others on the basis of a changed
relationship between BDATE and LICDATE. In 1969 FUNC 3 indicates
that Clusters 2, 3, and 4 were distinguished from the others
by an inclination of new doctors therein to be more recently
licensed but older than expected. This differentiation,
however, is noticeable only after that due to the first two
discriminant functions has been calculated and it should,
therefore, be treated with caution.

Summary of Cluster and Discriminant
Analyses Results for All Data Years

The quantity, variety, and richness of the results
obtained from the combination of cluster and discriminant
analyses do not lend themselves to simple summary. This fact,
itself, is, at once, confusing and informative. It is
confusing in that it is difficult to unravel the trends and
generalizations that may be drawn from the data. It is
informative because it suggests that these trends and
generalizations may, at best, be misleading and, at worst, be false.
It is, therefore, probably best to let the reader form his
or her own interpretations from the results recorded in
Tables VI-1 through VI-9. Nevertheless, it is worthwhile to
pursue a few threads among the results in hopes that they
will lead the reader to the discovery of longer and stronger
ones.

Cluster analysis consistently produced clusters in the
Inner City, Lakewood, and Far Western Suburbs areas. The
Inner City clusters have gradually declined in importance
over the time of the study and have become centered on
Cleveland Metropolitan General Hospital. The Lakewood
clusters have remained remarkably stable over the years and the
Far Western Suburbs clusters have steadily increased in
importance. The period since 1940 has seen the constant growth
of what might be called the Rocky River zone of new doctors'
offices extending from Lakewood to Fairview Park. This area
increasingly has come to dominate the western section of the
Cleveland SMSA.

Summary of the discriminant analysis results is much
more difficult and much more open to debate. The conclusion
best supported by the results is that the LICDATE and STATE
variables are the most important in discriminating among
groupings over the years included in the study period.

Table VI-10 shows in summary form some of the MDA
results. Only the first discriminant function and variables
with very high loadings on that function are included. In
addition the table indicates which groupings the first
discriminant function appears to discriminate best. Although
it indicates great variability over the study period, Table
VI-10 shows a tendency for the groupings best discriminated
to change gradually from an Inner City versus Outer City
division to a Rocky River Zone as opposed to other areas
dichotomy.

The trend in discriminating variables is more obscure.
The LICDATE and STATE variables appear to compete for
dominance but neither ultimately is more important than the
other in the research area during the years covered by the
study.


TABLE VI-10

RESULTS OF DISCRIMINANT ANALYSIS FOR ALL DATA YEARS

        Variables with High
        Loadings on First
Year    Discriminant Function   Groupings Best Discriminated

1921    BDATE                   Far Suburban and Metropolitan
        SPEC 3                  Hospital vs. All Others
        STATE 2

1931    STATE 1                 Outer Suburban and Lakewood vs.
        STATE 2                 All Others
        LICDATE

1940    LICDATE                 Inner City vs. Outer Suburban
        BDATE
        (STATE 3)*

1950    STATE 1                 Lakewood and Southwestern Suburbs
        STATE 2                 vs. All Others

1960    SPEC 1                  Cleveland vs. The Suburbs
        (STATE 3)

1969    LICDATE                 Lakewood and Western Cleveland vs.
        BDATE                   All Others

*Parentheses indicate dummy variables whose presence has been
inferred from the loadings on related dummy variables.


CHAPTER  VII 
SUMMARY  AND  CRITIQUE 

As  stated  in  the  introductory  chapter,  and  modified  in 
the  subsequent  analyses,  this  study  has  had  two  primary  goals. 
The  first  is  to  investigate  the  location  patterns  of  physi- 
cians in  the  western  part  of  the  Cleveland  SMSA  over  an  ex- 
tended period  (1910-1970)  in  relation  to  the  backgrounds  and 
selected  characteristics  of  those  physicians.  The  second, 
and  interrelated,  goal  has  been  to  examine  and  evaluate  a 
number  of  methods  which  appear  promising  for  the  analysis  of 
historical  data  that  are  large  in  quantity,  but  indetermi- 
nate in  quality. 

To  make  a small  but  meaningful  step  in  the  direction  of 
realizing  these  goals,  it  was  necessary  continually  to  mod- 
ify, refashion,  or  redefine  the  presumptions  or  hypotheses 
to  reflect  the  limitations  of  available  data,  time,  and  other 
resources.  It  is  hoped  that  the  original  goals  of  understan- 
ding physician-location  decisions  and  finding  methods  that 
contribute  to  furthering  that  understanding  have  remained 
unchanged  in  their  essence. 

The western part of the Cleveland SMSA, including the
west side of Cleveland city and several suburbs in the
western part of Cuyahoga County, was chosen as the study area.
Data on the backgrounds and specialties of all physicians
practicing in the area in the period 1910 to 1970 were
extracted from the American Medical Association directories
of licensed physicians for the years which best corresponded
to the census years of this period. This included
directories for the years 1912, 1921, 1931, 1940, 1950, 1960, and
1969. The data gathered on each physician included name,
office address, birthdate, medical school, license date, and
specialty. On the basis of addresses, geographic coordinates
were calculated for each physician for each data year and
codes were added indicating whether or not he or she
remained in the study area in the succeeding year.

On the basis of other studies and preliminary
examination of the data, a first presumption was made that the
physician's practice location was a function of his or her
background characteristics. A second presumption was made
that the parameters of the relationship between background
characteristics and practice location changed during the
study period. These presumptions were tested and somewhat
modified through several analyses including visual
examination of plots, Automatic Interaction Detection, factor with
regression analysis, canonical correlation analysis, and
finally cluster with discriminant analysis. The results of
these analyses are summarized at the end of each respective
chapter. The present chapter attempts to bring together
individual results into those threads that appear to run through
them.

Returning to the original presumptions (pages 38 and
39), they appear as:

Presumption I:

LOC = f(YEAR, BDATE, LICDATE, STATE, SPEC)

and

Presumption II:

The relationship between LOC and the independent
variables in the above equation has changed over time.

There seems little doubt that the first presumption is
acceptable for the study area and time on the basis of Rescher's
rules for acceptance of presumptions (see page 24). Each
and every one of the analytical techniques indicated a
statistically significant relationship between location, however
it was measured, and the prediction variables. Generally,
the more "sophisticated" the method of measuring
location, the more significant the results. Given the
diversity of tests and the variations from year to year in the
samples, there seems to be strong support for the first
presumption according to both Rescher's and Tukey's guidelines
for acceptance.

A simple test of this presumption was made by using the
cluster and discriminant analyses results for prediction
(see Chapter VI). These results were satisfactory if not
spectacular. By all indications and for all data years the
predictions were significantly better than uninformed
guessing would have produced. Hence, the first presumption may
be given a tentative acceptance as worthy of further testing
in other areas and with additional data.
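
The comparison with uninformed guessing can be illustrated from
the figures in Chapter VI. The sketch below sets each year's
reported hit rate against two common chance baselines: the
proportional chance rate (guessing in proportion to cluster size)
and the maximum chance rate (always guessing the largest cluster).
These baselines are an assumption for illustration; the criterion
the study itself used for "uninformed guessing" is not stated. The
year 1921 is omitted because its cluster sizes are not tabulated in
Table VI-3, and the sketch does not test statistical significance.

```python
# Cluster sizes and hit rates copied from Tables VI-4, VI-5, VI-7,
# VI-8, and VI-9.
RESULTS = {
    1931: ([61, 27, 30, 13, 7], 0.514),
    1940: ([22, 30, 26, 21, 4], 0.388),
    1950: ([41, 47, 4, 24, 14], 0.477),
    1960: ([68, 36, 37, 32, 9], 0.423),
    1969: ([34, 32, 40, 17, 23], 0.342),
}


def chance_baselines(sizes):
    """Return (proportional, maximum) chance hit rates for group sizes."""
    total = sum(sizes)
    proportions = [n / total for n in sizes]
    proportional = sum(p * p for p in proportions)  # guess by group share
    maximum = max(proportions)                      # always guess largest
    return proportional, maximum


for year, (sizes, hit_rate) in RESULTS.items():
    proportional, maximum = chance_baselines(sizes)
    print(f"{year}: hit {hit_rate:.1%}, proportional chance "
          f"{proportional:.1%}, maximum chance {maximum:.1%}")
```

In each tabulated year the reported hit rate exceeds both
baselines, though the margin over the maximum chance rate is modest
in 1960 and 1969.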

The  second  presumption  at  first  inspection  also  seems 
to  have  considerable  evidence  to  support  it  in  the  study 
area  during  the  time  period  covered  by  the  research.  Every 
one  of  the  tests  from  the  visual  examination  of  the  doctors' 
maps  to  the  discriminant  analysis  indicates  considerable 
change  over  time  in  the  relationship  between  location  and 
the  prediction  variables. 

Most  of  the  tests  confirm  the  visual  impression  that 
the  period  after  the  second  World  War  witnessed  a vastly  in- 
creased dispersion  in  the  distribution  pattern  of  doctors  on 
Cleveland's  west  side.  The  AID  and  discriminant  analyses 
results  indicate  that  the  relationship  between  location  and 
the  predictor  variables  was  not  as  strong  prior  to  this  time 
as  it  was  afterwards.  Part  of  this  may  be  caused  by  the 
fact  that  the  techniques  were  unable  to  separate  into  groups 
doctors  who  were  tightly  clustered  before  1940. 

All the techniques indicate LICDATE, or date of doctor's
licensing, as the most important determinant of location
over time. Moreover, results from AID and regression
analyses indicate that it has been most important in the years
since 1960. On the other hand, the results from
discriminant analysis are not as clear-cut on the changing
importance of LICDATE. Closer examination of all the results
suggests that a more complex interpretation may be made as
follows:

1. LICDATE has been important at all times in
   differentiating doctors on the basis of their distance
   from the CBD.

2. Since 1940 doctors' locations have expanded
   dramatically outward from the inner city.

3. The above two statements provide the bases for the
   findings of the techniques which measure location
   as simply DISTCBD. DISTCBD is very much a
   function of LICDATE.

4. The location of doctors measured by both XCOOR and
   YCOOR has become much more complex since 1940.
   There appears to be a new core of doctors which is
   centered along the Rocky River.

5. Techniques which measure location as a
   multivariate give much more importance to other
   determinants such as STATE and SPEC.
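
Points 3 to 5 turn on the difference between a single distance
measure and a pair of coordinates. Assuming DISTCBD is the
straight-line distance from the CBD computed from XCOOR and YCOOR
(the exact formula and the coordinate origin are not restated
here, so both are assumptions), the sketch below shows how two
hypothetical practices in different directions collapse to the
same DISTCBD value.

```python
import math

# Assumed CBD position in the same coordinate units as XCOOR and YCOOR.
CBD_X, CBD_Y = 0.0, 0.0


def distcbd(xcoor, ycoor):
    """Straight-line distance of a practice from the assumed CBD origin."""
    return math.hypot(xcoor - CBD_X, ycoor - CBD_Y)


# Two hypothetical practices: one due west, one to the southwest.
west_office = (-5.0, 0.0)
southwest_office = (-3.0, -4.0)

print(distcbd(*west_office), distcbd(*southwest_office))
```

Both offices lie 5.0 units from the CBD, so a univariate method
treats them alike, whereas a method that keeps XCOOR and YCOOR
separate can tell them apart.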

In short, evidence in support of the second presumption
is less overwhelming than that in the case of the first.
Much of the apparent change in the relationship seems to be
caused by the fact that certain of the various techniques
are better at differentiating some types of data while
others are superior in handling different types.
Nevertheless, all the techniques indicate that there is change
occurring but that the direction of the change is difficult to
discern. More work needs to be done to establish exactly
the directions in which the changes are proceeding.

This raises the question of evaluation of the techniques
themselves. Each of the previous chapters presents the
techniques in the appropriate research situation. Certain of the
techniques hold especial promise for use in geography.
Automatic Interaction Detection seems to have suffered
unwarranted neglect by geographers as a useful device for initial
examination of large data sets. If the computer program is
available, AID is extremely easy to use and the results,
particularly in the form of the tree diagrams, can
illuminate many research problems. Although small data sets
present some limitations, they do not seem as serious as Sonquist
(1971) indicates, providing the analyses are replicated with
differing samples.

Although factor and regression analyses did not prove
especially fruitful in the present study, these techniques
are well established and of sufficient diversity to be
useful in most research problems if they are handled with care
and the results interpreted cautiously.

Conversely, canonical correlation and cluster analysis
together with discriminant analysis have considerable
advantage for use in geographical and historical research
situations. Many research problems include dependent or
criterion variables which should be multivariate in form.
Frequently, the researcher dispenses with the extra dimensions
or reduces them by factor analysis. Such operations may be
adequate for some research situations. But the new techniques
that do handle multivariate criterion variables provide a
viable and, perhaps, better alternative in many situations.

The effort required for interpretation may be
substantial but in most cases it will be rewarded by an increased
depth of understanding. The results of the earlier analyses
in this study barely indicate the abundance that may be
obtained. Other varieties of techniques which handle
multivariate criteria but which were not included in the present
study, such as Multivariate Analysis of Variance (MANOVA)
and Covariance (MANCOVA), also are worthy of consideration.

The  more  complex  statistical  techniques  may  be  more 
difficult  to  use  and  may  require  more  time  and  care  in  assem- 
bling the  research  procedure,  but  the  results  in  this  study 
indicate  that  they  produce  very  rich  results.  And  the  com- 
plexity of  their  output  insures  against  the  broad  generali- 
zations that  sometimes  characterize  the  output  of  simpler 
techniques.  In  the  present  state  of  history  and  geography 
(Harvey,  1969)  this  is  not  an  altogether  unfortunate  result. 

Much  more  research  needs  to  be  done  in  Cleveland  (and 
elsewhere)  before  broad,  far-reaching  conclusions  such  as 
those  of  Dewey  (1973)  can  be  made.  The  next  phase  of  the 
present  study  will  include  the  demographic  and  other  vari- 
ables that  define  neighborhoods  in  which  physicians  locate 
their  practices.  The  techniques  applied  in  the  present 
study,  particularly  those  including  multivariate  criteria,
should  be  even  more  useful  in  such  research.




The completion of the next step in this ongoing research
may increase understanding of why doctors locate where they
do--at least on the west side of Cleveland during the years
1910 to 1970. By that time other changes will have occurred.
Medical care will have changed as will geography. But all
researchers, be they geographical, historical, or otherwise,
who operate in the temporal dimension are faced with the
same problem. As David Harvey (1969, 431) has said:

. . . But in seeking for explanations that stretch back
over time, we may choose among a variety of modes ranging
from the logically rigorous process model to simple
narrative. At the present time it seems that nothing
useful can come from insisting that the only admissible
form of explanation is that which is rigorously
scientific and objective. We should, however, be prepared
to admit the problems inherent in using less rigorous
modes of temporal explanation. The problem is
not, therefore, that we fail to be rigorously scientific
and objective, but that we fail to acknowledge the respects
in which we have been forced to compromise with
this ideal, and hence fail to distinguish between permissible
and non-permissible inferences, given the logic
of the situation.

Neither time nor space is an easy dimension with which
to work; however, we have little choice.